Best sql-server questions in April 2011

When is it better to store flags as a bitmask rather than using an associative table?

23 votes

I’m working on an application where users have different permissions to use different features (e.g. Read, Create, Download, Print, Approve, etc.). The list of permissions isn’t expected to change often. I have a couple of options of how to store these permissions in the database.

In what cases would Option 2 be better?

Option 1

Use an associative table.

User
----
UserId (PK)
Name
Department

Permission
----
PermissionId (PK)
Name

User_Permission
----
UserId (FK)
PermissionId (FK)

Option 2

Store a bitmask for each user.

User
----
UserId (PK)
Name
Department
Permissions

[Flags]
enum Permissions {
    Read = 1,
    Create = 2,
    Download = 4,
    Print = 8,
    Approve = 16
}

Splendid question!

Firstly, let's make some assumptions about "better".

I'm assuming you don't much care about disk space. A bitmask is space-efficient, but I'm not sure that matters much if you're using SQL Server.

I'm assuming you do care about speed. A bitmask can be very fast when using calculations - but you won't be able to use an index when querying the bitmask. This shouldn't matter all that much, but if you want to know which users have create access, your query would be something like

select * from [User] where (Permissions & 2) = 2  -- 2 = Create

(haven't got access to SQL Server today, on the road). That query would not be able to use an index because of the mathematical operation - so if you have a huge number of users, this would be quite painful.

I'm assuming you care about maintainability. From a maintainability point of view, the bitmask does not express the underlying problem domain as clearly as storing explicit permissions does. You'd almost certainly have to synchronize the value of the bitmask flags across multiple components, including the database. Not impossible, but a pain in the backside.

So, unless there's another way of assessing "better", I'd say the bitmask route is not as good as storing the permissions in a normalized database structure. I don't agree that it would be "slower because you have to do a join": unless you have a totally dysfunctional database, you won't be able to measure this (whereas querying without the benefit of an active index can become noticeably slower with even a few thousand records).
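To make the comparison concrete, here is a minimal sketch in Python that runs the "who has Create access?" query against both designs. It uses the standard-library sqlite3 module purely as a self-contained stand-in for SQL Server, which is an assumption of this example. The table names UserMask and AppUser and the three sample users are invented for illustration; the flag values come from the question's enum.

```python
import sqlite3

# Flag values copied from the question's enum.
READ, CREATE, DOWNLOAD, PRINT, APPROVE = 1, 2, 4, 8, 16
flags = {READ: "Read", CREATE: "Create", DOWNLOAD: "Download",
         PRINT: "Print", APPROVE: "Approve"}

# Hypothetical sample users: id -> (name, permission bitmask).
users = {
    1: ("alice", READ | CREATE),
    2: ("bob", READ),
    3: ("carol", READ | CREATE | APPROVE),
}

con = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
con.executescript("""
    -- Option 2: one bitmask column per user
    CREATE TABLE UserMask (UserId INTEGER PRIMARY KEY, Name TEXT, Permissions INTEGER);

    -- Option 1: associative table
    CREATE TABLE AppUser (UserId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Permission (PermissionId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE User_Permission (
        UserId INTEGER REFERENCES AppUser(UserId),
        PermissionId INTEGER REFERENCES Permission(PermissionId),
        PRIMARY KEY (PermissionId, UserId)
    );
""")

con.executemany("INSERT INTO Permission VALUES (?, ?)", flags.items())
for user_id, (name, mask) in users.items():
    con.execute("INSERT INTO UserMask VALUES (?, ?, ?)", (user_id, name, mask))
    con.execute("INSERT INTO AppUser VALUES (?, ?)", (user_id, name))
    for flag in flags:
        if mask & flag:  # expand the bitmask into one row per granted permission
            con.execute("INSERT INTO User_Permission VALUES (?, ?)", (user_id, flag))

# Bitmask design: the column is wrapped in an expression, so the predicate
# has to be evaluated row by row.
mask_rows = con.execute(
    "SELECT Name FROM UserMask WHERE (Permissions & ?) = ? ORDER BY Name",
    (CREATE, CREATE),
).fetchall()

# Associative design: a plain equality on PermissionId, which the key on
# (PermissionId, UserId) can serve directly.
join_rows = con.execute(
    """SELECT u.Name
       FROM AppUser u
       JOIN User_Permission up ON up.UserId = u.UserId
       WHERE up.PermissionId = ?
       ORDER BY u.Name""",
    (CREATE,),
).fetchall()

print(mask_rows)
print(join_rows)
```

Both queries return the same users; the difference is only in what the engine can do with the predicate, which is the point made above about indexes.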

Interesting SQL puzzle

10 votes

Without loops or cursors, how do you take a list of date intervals and turn them into a string of 1s and 0s such that:

  • each bit represents each day from min(all the dates) to max(all the dates)
  • the bit is 1 if that day falls inside any of the date intervals
  • the bit is 0 if that day does not fall in any of the intervals

So for example, if the intervals were:

  • 1/1/2011 to 1/2/2011
  • 1/4/2011 to 1/5/2011

Then the SQL you write should output 11011. Here is a setup script you could use:

declare @TimeSpan table
(
    start datetime
    ,finish datetime
)

-- this is a good data set, with overlapping and non-overlapping time spans
insert into @TimeSpan values ('02/02/2010', '02/02/2010')
insert into @TimeSpan values ('02/03/2010', '02/03/2010')
insert into @TimeSpan values ('02/04/2010', '02/05/2010')
insert into @TimeSpan values ('02/05/2010', '02/06/2010')
insert into @TimeSpan values ('02/07/2010', '02/09/2010')
insert into @TimeSpan values ('02/08/2010', '02/08/2010')
insert into @TimeSpan values ('02/08/2010', '02/10/2010')
insert into @TimeSpan values ('02/14/2010', '02/16/2010')

-- for this set of data, the output string would be 111111111000111

DECLARE @Result VARCHAR(MAX), @start DATETIME

SELECT @start= MIN(start) ,
       @Result =REPLICATE('0',1+DATEDIFF(DAY,MIN(start),MAX(finish)))
FROM @TimeSpan

SELECT @Result = STUFF(@Result,
                       DATEDIFF(DAY,@start,start)+1,
                       DATEDIFF(DAY,start,finish)+1,
                       REPLICATE('1',1+DATEDIFF(DAY,start,finish)))
FROM @TimeSpan 

SELECT @Result       
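For anyone who wants to check the expected output without a SQL Server instance, here is a small Python sketch of the same idea as the STUFF approach: start with a string of zeros covering the whole date range, then overwrite each interval's slice with ones. The function name coverage_bits is made up for this example, and the Python loop is only there to verify the result, not as an answer to the "no loops" puzzle itself.

```python
from datetime import date

def coverage_bits(spans):
    """Return one character per day from the earliest start to the latest finish."""
    first = min(start for start, _ in spans)
    last = max(finish for _, finish in spans)
    bits = ["0"] * ((last - first).days + 1)  # like REPLICATE('0', ...)
    for start, finish in spans:
        lo = (start - first).days
        hi = (finish - first).days
        # Like STUFF(..., REPLICATE('1', n)): overwrite this interval's slice
        bits[lo:hi + 1] = "1" * (hi - lo + 1)
    return "".join(bits)

# Same data as the setup script (month/day/year in the original).
spans = [
    (date(2010, 2, 2), date(2010, 2, 2)),
    (date(2010, 2, 3), date(2010, 2, 3)),
    (date(2010, 2, 4), date(2010, 2, 5)),
    (date(2010, 2, 5), date(2010, 2, 6)),
    (date(2010, 2, 7), date(2010, 2, 9)),
    (date(2010, 2, 8), date(2010, 2, 8)),
    (date(2010, 2, 8), date(2010, 2, 10)),
    (date(2010, 2, 14), date(2010, 2, 16)),
]

print(coverage_bits(spans))  # 111111111000111
```

Overlapping intervals are harmless here because writing a "1" over an existing "1" changes nothing.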

Not equal <> != operator in T-SQL on NULL

7 votes

Could someone please explain the following behavior in SQL?

SELECT * FROM MyTable WHERE MyColumn != NULL      -- 0 results
SELECT * FROM MyTable WHERE MyColumn <> NULL      -- 0 results
SELECT * FROM MyTable WHERE MyColumn IS NOT NULL  -- 568 results

<> is Standard SQL-92; != is its equivalent. Both compare values, and NULL is not a value: it is a placeholder marking the absence of one, so a comparison against it evaluates to UNKNOWN rather than true.

Which is why you can only use IS NULL/IS NOT NULL to evaluate for such situations.

And it's not specific to SQL Server. All standards-compliant SQL dialects work the same way.
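The behaviour is easy to reproduce in any standards-following engine. Here is a minimal sketch using Python's standard-library sqlite3 module, chosen only because it needs no server; the three sample rows and the helper name count are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MyTable (MyColumn TEXT)")
# Two real values and one NULL (hypothetical sample data).
con.executemany("INSERT INTO MyTable VALUES (?)", [("a",), ("b",), (None,)])

def count(where):
    """Count the rows for which the given WHERE clause evaluates to true."""
    return con.execute(f"SELECT COUNT(*) FROM MyTable WHERE {where}").fetchone()[0]

not_equal = count("MyColumn <> NULL")      # UNKNOWN for every row, so no rows
bang_equal = count("MyColumn != NULL")     # same operator, same outcome
is_not_null = count("MyColumn IS NOT NULL")

# The comparison itself yields NULL (None in Python), not true or false.
null_vs_null = con.execute("SELECT NULL <> NULL").fetchone()[0]

print(not_equal, bang_equal, is_not_null, null_vs_null)  # 0 0 2 None
```

A WHERE clause keeps only rows where the predicate is true, and UNKNOWN is not true, which is why the first two queries come back empty.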

How do I group on continuous ranges

7 votes

I know some basic SQL, but this one is beyond me. I have looked high and low but no dice. I need a view of the following data. I could do this in application-layer code, but unfortunately for this particular case the code must live in the data layer.

I am using T-SQL.

Table

Date      Crew       DayType
01-02-11  John Doe  SEA  
02-02-11  John Doe  SEA  
03-02-11  John Doe  SEA  
04-02-11  John Doe  HOME  
05-02-11  John Doe  HOME  
06-02-11  John Doe  SEA 

I need a view like this

DateFrom  DateTo    Name      DayType
01-02-11  03-02-11  John Doe  SEA
04-02-11  05-02-11  John Doe  HOME
06-02-11  06-02-11  John Doe  SEA

Unfortunately, the application layer requires the base table to stay in the format shown. Is this possible to do in a query?

Thanks

Luke

WITH    q AS
        (
        SELECT  *,
                ROW_NUMBER() OVER (PARTITION BY crew, dayType ORDER BY [date]) AS rnd,
                ROW_NUMBER() OVER (PARTITION BY crew ORDER BY [date]) AS rn
        FROM    mytable
        )
SELECT  MIN([date]), MAX([date]), crew AS name, dayType
FROM    q
GROUP BY
        crew, dayType, rnd - rn
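The trick in that query is the rnd - rn grouping key: within one unbroken run of the same dayType both row numbers advance by one per row, so their difference stays constant, and it changes as soon as another dayType interrupts the run. Here is a plain-Python sketch of the same bookkeeping on the question's data, with the two ROW_NUMBER() calls replaced by simple counters. It assumes the dates in the question are day-month-year.

```python
from collections import defaultdict
from datetime import date

# (date, crew, day_type) rows from the question's table.
rows = [
    (date(2011, 2, 1), "John Doe", "SEA"),
    (date(2011, 2, 2), "John Doe", "SEA"),
    (date(2011, 2, 3), "John Doe", "SEA"),
    (date(2011, 2, 4), "John Doe", "HOME"),
    (date(2011, 2, 5), "John Doe", "HOME"),
    (date(2011, 2, 6), "John Doe", "SEA"),
]

rn = defaultdict(int)    # counter per crew             (PARTITION BY crew)
rnd = defaultdict(int)   # counter per (crew, dayType)  (PARTITION BY crew, dayType)
islands = defaultdict(list)

for day, crew, day_type in sorted(rows):  # ORDER BY [date]
    rn[crew] += 1
    rnd[crew, day_type] += 1
    # Same role as GROUP BY crew, dayType, rnd - rn
    islands[crew, day_type, rnd[crew, day_type] - rn[crew]].append(day)

result = sorted(
    (min(days), max(days), crew, day_type)
    for (crew, day_type, _), days in islands.items()
)
for row in result:
    print(row)
```

The three printed rows correspond to the three rows of the view requested in the question: SEA from the 1st to the 3rd, HOME from the 4th to the 5th, and SEA again on the 6th.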

This article may be of interest to you:

Two separate instances of SQL Server running a different explain plan

7 votes

Here's one where I need help from the SQL administrators out there. I have two separate SQL Server instances on Amazon EC2. One is our staging environment, and the other is our production environment, but they are configured exactly the same way (spawned from the same image).

We had a database that we copied from staging to our production environment last week. The way we copy a db to production is we take a backup of it on our staging site, and restore the backup in production. Anyways, we found that in production, one particular complex query was timing out after an hour, but that exact query in our staging environment completed in 10 minutes.

The explain plans on both were almost the same, except that one server was doing a PK scan on a large table (8M rows), and the other server was doing an index seek. We're assuming this was the difference. So one server was doing a lot of disk IO, and the other was not.

So my question is, what are the reasons that one installation of SQL Server would decide to use an index, while another one ignores it, assuming the same version of SQL Server and the same data set? Even better, what are the best ways to find out why SQL Server is ignoring an index?

This was our mistake.

After much digging, we found that one of our devs had added a couple of additional indexes to the production db after the transfer. This was a case where the additional indexes actually caused the query optimizer to pick a less efficient route in the production environment.

Removing those additional indexes appeared to have addressed the performance issue for the particular query, and both explain plans are now the same.

What do you do when your primary key overflows?

7 votes

We have a table, with an auto-increment int primary key, whose max value is now at the limit for the T-SQL int type. When we try to re-seed the table (because there are large gaps in the keys, and nowhere near as many rows as the max int value), it somehow keeps getting reset to the max int value.

Obviously, this causes serious problems. The PK never should have gotten to this value and changing the data type would be a big task. Re-seeding should be sufficient, but it's not working!

How could it keep getting reset?

Edit: to clarify the situation, this query SELECT MIN(CategoryID), MAX(CategoryID) FROM dbo.tblCategories returns -2147483647, 2147483647... meaning that there are actual PK values at the min and max of the int type.

You can "reseed" a table so that the next-assigned identity value will be less than the current max value in the table. However, any subsequent DBCC CHECKIDENT will reset the internal counter back to the max value currently in the column. Perhaps that's where the reset is coming from? And of course an insert will eventually hit a duplicate value, resulting in Interesting Times for the production support crew.

By and large, you're in trouble. I recommend working up a one-time script to remove/reset the uber-high ID values. Updating the rows (and all related foreign key values) is one option, though it would involve having to disable the foreign key constraints and I don't know what all else, so I wouldn't recommend it. The other would be to create exact copies of all the data for the "high-id" items using more pragmatic Id values, and then delete the original entries. This is a hack, but it would result in a much more maintainable database.

Oh, and track down the folks who put those high-id values in, and--at the very least--revoke their access rights to the database.

SQL Server 2005 Transactional Replication Fails to Publish Stored Procedure Containing an Index Create

6 votes

I've experienced a bizarre problem with a SQL Server 2005 Transactional Publication. The issue is this: If the publication contains an article that is a stored procedure that contains a create index statement, then there is an error thrown when attempting to replicate the schema of the stored procedure to a subscriber.

The behavior is very odd, because even if the create index statement is commented out, it still gives the exception, and it will only work if it is removed altogether.

Here is the exact error that's being returned:

Command attempted: GRANT EXECUTE ON [dbo].[usp_Test] TO [CompanyDatabase_access]

(Transaction sequence number: 0x00000170000008B9000500000000, Command ID: 5)

Error messages: Cannot find the object 'usp_Test', because it does not exist or you do not have permission. (Source: MSSQLServer, Error number: 15151) Get help: http://help/15151 Cannot find the object 'usp_Test', because it does not exist or you do not have permission. (Source: MSSQLServer, Error number: 15151) Get help: http://help/15151

The error is accurate, because when I check on the subscriber, the stored procedure wasn't created as expected... but that was the purpose of the publication...

Additionally, I can create the stored procedure manually on the subscriber, but when I generate a snapshot, it deletes the existing stored procedure and then still returns this error message.

And here's a sample publication that creates this issue.

The stored procedure:

USE [CompanyDatabase]
GO

CREATE PROCEDURE [dbo].[usp_Test]

AS

CREATE TABLE #TempTable(ID INT)
CREATE NONCLUSTERED INDEX [IX_TempTable] ON [dbo].[#TempTable](ID)
SELECT 'Test'
GO

GRANT EXECUTE ON [dbo].[usp_Test] TO [CompanyDatabase_access]
GO

The publication script:

-- Adding the transactional publication
use [CompanyDatabase]
exec sp_addpublication 
    @publication = N'Replication Test', 
    @description = N'Publication of database ''CompanyDatabase''.', 
    @sync_method = N'concurrent', 
    @retention = 0, 
    @allow_push = N'true', 
    @allow_pull = N'true', 
    @allow_anonymous = N'false', 
    @enabled_for_internet = N'false', 
    @snapshot_in_defaultfolder = N'true', 
    @compress_snapshot = N'false', 
    @ftp_port = 21, 
    @ftp_login = N'anonymous', 
    @allow_subscription_copy = N'false', 
    @add_to_active_directory = N'false', 
    @repl_freq = N'continuous', 
    @status = N'active',
    @independent_agent = N'true',
    @immediate_sync = N'false', 
    @allow_sync_tran = N'false', 
    @autogen_sync_procs = N'false', 
    @allow_queued_tran = N'false', 
    @allow_dts = N'false', 
    @replicate_ddl = 1, 
    @allow_initialize_from_backup = N'false', 
    @enabled_for_p2p = N'false', 
    @enabled_for_het_sub = N'false'
GO

-- Adding the transactional articles
use [CompanyDatabase]
exec sp_addarticle 
    @publication = N'Replication Test', 
    @article = N'usp_Test', 
    @source_owner = N'dbo', 
    @source_object = N'usp_Test', 
    @type = N'proc schema only', 
    @description = N'', 
    @creation_script = N'', 
    @pre_creation_cmd = N'drop', 
    @schema_option = 0x0000000048000001, 
    @destination_table = N'usp_Test', 
    @destination_owner = N'dbo', 
    @status = 16
GO

-- Adding the transactional subscriptions
use [CompanyDatabase]
exec sp_addsubscription 
    @publication = N'Replication Test', 
    @subscriber = N'OtherDatabaseServer', 
    @destination_db = N'CompanyDatabase', 
    @subscription_type = N'Pull', 
    @sync_type = N'automatic', 
    @article = N'all', 
    @update_mode = N'read only', 
    @subscriber_type = 0
GO

The subscription script:

/****** Begin: Script to be run at Subscriber ******/
use [CompanyDatabase]
exec sp_addpullsubscription 
    @publisher = N'DatabaseServer', 
    @publication = N'Replication Test', 
    @publisher_db = N'CompanyDatabase', 
    @independent_agent = N'True', 
    @subscription_type = N'pull', 
    @description = N'', 
    @update_mode = N'read only', 
    @immediate_sync = 0

exec sp_addpullsubscription_agent 
    @publisher = N'DatabaseServer', 
    @publisher_db = N'CompanyDatabase', 
    @publication = N'Replication Test', 
    @distributor = N'DatabaseServer', 
    @distributor_security_mode = 1, 
    @distributor_login = N'', 
    @distributor_password = N'', 
    @enabled_for_syncmgr = N'False', 
    @frequency_type = 64, 
    @frequency_interval = 0, 
    @frequency_relative_interval = 0, 
    @frequency_recurrence_factor = 0, 
    @frequency_subday = 0, 
    @frequency_subday_interval = 0, 
    @active_start_time_of_day = 0, 
    @active_end_time_of_day = 235959, 
    @active_start_date = 0, 
    @active_end_date = 0, 
    @alt_snapshot_folder = N'', 
    @working_directory = N'', 
    @use_ftp = N'False', 
    @job_login = null, 
    @job_password = null, 
    @publication_type = 0
GO
/****** End: Script to be run at Subscriber ******/

Again, the odd thing is that the publication will still contain the same error if the create index statement is commented out, but it will work if it is removed altogether.

For now, I've just removed all stored procedures that contain these create index statements from the publication, but I would like to have them replicated to the subscribers so that any DDL updates to the procedures will be automatically reflected on the subscribers.

-- EDIT --

Looking in the snapshot directory, the .sch file for usp_Test contains the exact same code block I previously posted for the stored procedure... based on the error returned, it seems like the snapshot agent decides not to run the CREATE PROCEDURE command if it contains a create index, but then continues on and tries to run the GRANT EXECUTE command, which causes the error.

Also, my exact version of SQL Server is:

Microsoft SQL Server 2005 - 9.00.5254.00 (2005 + SP4 Cumulative Update 1)

-- END EDIT --

My question is, why is this happening? Is there an issue with the configuration of my publication or subscription? Has anyone else experienced anything like this? Where would I start in troubleshooting this issue?

-- UPDATE --

I've been talking to Hilary Cotter on technet... and still no luck. If I remove the GRANT EXECUTE permission on the procedure, then it creates successfully with the CREATE INDEX. So it will work with GRANT EXECUTE OR CREATE INDEX, but not both. Hilary suggested that it might be some type of spam appliance in my domain that was preventing the snapshot from being transferred correctly when it contained both of those keywords, but if I manually copy the .sch file to the subscriber and validate that it contains the expected commands, I still get the same issue.

With the following code, the stored procedure in the snapshot fails to be applied:

CREATE NONCLUSTERED INDEX [IX_TempTable] ON [dbo].[#TempTable](ID)

But, changing the syntax slightly causes the stored procedure to create without issue:

ALTER TABLE dbo.#TempTable ADD CONSTRAINT IX_TempTable UNIQUE NONCLUSTERED ( ID )

I can't explain it, and after spending literally hours on this glitch, I'm ready to just stop looking for an explanation and settle for this workaround.

Insert row in DB while using multithreads?

6 votes

Hi experts,

I am using multithreading with LINQ to SQL.

Here is my code snippet:

public class PostService
{ 
    MessageRepository objFbPostRespository = new MessageRepository();
    public void callthreads()
    {
        for (int i = 0; i < 100; i++)
        {
            Thread th = new Thread(postingProcess);
            th.Start();
        }
    }

    public void postingProcess()
    {
        objFbPostRespository.AddLog("Test Multithread", DateTime.Now);
    }
}

Message Repository class

class MessageRepository
{        
    DataClassesDataContext db_Context = new DataClassesDataContext();
    public void AddLog(string Message, DateTime CurrentDateTime)
    {
        FbMessgaeLog FbMessage = new FbMessgaeLog
        {
            Message = Message,
            Time = CurrentDateTime                
        };
        db_Context.FbMessgaeLogs.InsertOnSubmit(FbMessage);
        db_Context.SubmitChanges();
    }
}

When I run it without threads it works fine, but after adding threads I get the following error message:

Error: An item with the same key has already been added.

Thanks in advance...:)

You cannot use a LINQ DataContext in concurrent fashion:

Any instance members are not guaranteed to be thread safe.

Therefore you need to either serialize access (lock), which will be horribly inefficient, or, better, use a separate context in each thread:

public class PostService
{ 
    public void callthreads()
    {
        for (int i = 0; i < 100; i++)
        {
            Thread th = new Thread(postingProcess);
            th.Start();
        }
    }

    public void postingProcess()
    {
        using (MessageRepository objFbPostRespository = new MessageRepository())
        {
           objFbPostRespository.AddLog("Test Multithread", DateTime.Now);
        }
    }
}

I also hope, for your own sake, that your test has actual logic to wait for the test threads to complete before shutting down... And, of course, properly implement IDisposable in your repository and dispose the context so that the DB connection gets placed back in the pool.

SQL Server selecting a string from table using in clause

6 votes

I'm having a strange SQL Server issue using the following query:

SELECT id FROM table WHERE id IN ('id1', 'id2', .......)

When id is nchar(30) and 'id1', 'id2', ... are values, I get a result which isn't among the values I entered.

Is it possible that SQL Server is searching for a string contained in the values?

Added

query:

SELECT Word FROM WordDictionary WHERE Word IN ('DESPERADO', 'WWW.MYSAVINGS.COM', 'RELIED', 'GALS/GUYS....U', 'MISSOULA', 'STARING...WHY', 'OHIO,,,WHAT', 'ALEYO"MEANS', 'EXCRETE', 'POETERS', 'REMOVAL?IF', 'MOTOT', 'VIEW/SOUND', 'SCHOLD', 'FLINGS', '300000', 'BIGBANG', 'INVOKE', 'COMPLIER', 'UPNISHAD', 'FLUFF/LINT', 'DONATED?..PLEASE', 'EPHEDRINE', 'AGAIN-', 'WHUNT', 'LEVE', 'ARIEL', 'SEIZURES,AND', 'ANYON', 'WELL~AS', 'GGGGGGGGOOOOOOOOOOOOOOOOODDDDD', 'ALGERIA', 'LONDON...CAN', 'TWAIN''S', 'BUTIFUL', 'CIRRHIOSIS', 'PHP-NUKE', 'SCREWD', 'RECONNECT', 'BAND...''SIGUR', 'ROS''', 'DEFLEPOARD', 'FIHGT', 'DRE''S', 'ACQUAINTED', '77067', 'INCREASE/DECREASE', 'AWHILE..SHOULD', 'BABY???..MORE', 'CHRISTEN', 'SUNSLIFE', 'HYANCINTHS', 'NOVEMEBER', 'IEEE', 'IRENE', '5"4', 'BAYSIDE', 'DOJO', 'PEOPLES::DO', 'INFORMATION/ANSWER', 'BLACKWORM', 'MYWIFE.D', '42D', 'COLONEL', 'ESCAPES', 'KW', 'WASH/CLENSE', 'ENCOURAGES', 'HOLINESS', '4710', 'MONOATOMIC', 'FORM-', 'NAVIGATIONS', 'ASHLIEY(TWINS', 'ALIAS....WHAT', 'MARIOKART', 'HORNYNESS', 'CONVERSIONS', 'NUIT', 'PARISTEL:0660442290PL06', 'PUSSY', 'WILLOWS', 'BOYFRIEND/BABYDADDY', 'PARASITES', 'TABOILD', 'J.T', 'TERESEA', '---FREE---', 'KAMORA', 'SIMONS', 'FORSYTHIA', 'RAZORTHOUGHT', 'ABSINTHE', '9-3', 'BAIT-CASTING', 'CUMULATIVE', 'HELP>>>', 'MATZO', 'LIMOSINE', 'SCD353', 'BANGARAM', 'BRUNEL', 'KWTV878', 'NEAPOLITAN', 'OFYOUR', '2SIN', '²', '3SINX', 'IMPERFECTION', 'NONBELIEF', 'FLEM', 'NON-ADJACENT', 'WASHINGTION', 'WHERE/IF', 'BRONTE''S', 'WUTHERING', 'SOMEONE/A', 'TEAM.WHAT', 'PRESIDENT,WHEN', 'DIRICHLET', 'X-AND', 'Y-INTERCEPTS', 'STAMPED', 'PROCRDURE', 'AK32', '*67', 'HANUKKAH', 'MONIE', 'TAGAYTAY', 'NATURES', 'HASS', 'TORMENTS', 'PROPOTIONAL', 'SUDERLAND', 'CONROL', 'CONSEQUENCE', 'SAW?YOU', 'WITHDREW', 'PMT', 'JAIL?WAT', 'DEFFEND', '-12>8X', '4X>6', 'MX-C550', '6-DISC', 'SVQ3', 'BULLSHITING', 'PWEAZE', '23SECONDS', 'VISHWANATHAN''S', 'INTERNALIZING', 'MCCAFFERTY''S', 'TODAY...AND', 'CHANCE....WOULD', 'DEC.''41', '''45', 'HAILLE', 
'SELASSIE', 'OF...GREENDAY', 'DEAD/ARMY', 'EX-NFL', 'JACKSONVILLE,FL', 'ATLANTA,HOUSTON,OR', 'ECCENTRICITY', 'CONIC', 'XXX@YAHOO.CO.UK', 'XXX@YAHOO.COM', 'DISCRET', '_______', 'ROMACE', 'SUBCATEGORY', 'REDUDUCE', 'EXERCISER', 'MUNITE', 'MESSENGER.SO', 'NIGEL', 'PLANER', 'QUESTION?31576*66496139', 'KODJOE', '919', '1847', 'D.WADE', 'HUMAN''S', 'MULTI-NATIONAL', 'GOGGLE', 'GAAP', 'CONFUSED.IONT', 'ST8', 'ROOM/HOME', 'BOLB', 'GRANDMA-', 'PARSON', 'BELIZE', 'UNITY', 'AWARDS''', 'TOGHTHER', 'LONDON+GREATER', 'JERSEY...IE', 'NETGAR', 'NBC,ABC', 'CONON', 'RECIDENT', 'CANCERS', 'PITTSBURGH/INDY', 'CREATETH', 'MUSICAL''S', 'HEELLLLPPPP', 'MASRER', 'NAME,AND', 'ANAEROBIC', 'SPACIAL', 'SPOUSE/SIGNIFICANT', 'TRIGNOCEPHALY', 'RAW''S', 'BLOGGIN', '9.2', 'FLATTENING', 'FLOWER,ANIMALAND', 'EXPRESSES', 'FRDS', 'NOT?PLS', 'CLEARIFY', 'CLEARFIELD/JEFFERSON', 'HACE', 'FELICIANO', 'MEDICINE--THEY', 'DASCHUND', 'PLINTER', 'SKETCHY', 'I..WHAT', 'JUVENTINA', 'SOMUCH', 'SHEEN', 'HALEX', '11-IN-1', 'URBANIZATION', 'WILWOOD', 'CALIPER', 'NERVE-RACKING', 'OBSESESSION', 'EZRA', 'TALBOT', 'SHOCKWAVE', 'PASCO', '300$-600$', '108,000', '*BOYZ', 'ONLY*PLZ', 'INTERNET)CAN', 'CONCISE', 'TOP40', 'HICUPPS', '4:00', 'OPOSITE', 'NETWORKER?=', 'Q-LINK', 'HSG', 'AMINE', 'RIGHTS,,HOW', 'EMILIANO', 'PEDREGON', 'DILEMA', 'GROUPTHINK', 'MONTEAL', '17...&', '13:04', 'NASHIK', 'NOBIA', 'LINEWIRE', 'ISOCKS', 'DAY........WAIT', 'KATY', 'BODERLINE', 'CONNORS', 'WHWRE', 'CROMWELL', 'COE', '1+1=', 'UMMM.....WHATS', 'BOND''S', 'VIEWLOADER', 'MAXIS', 'MAT_LOVE83@YAHOO.COM', 'SCISSORS', 'UNSANITARY', 'KANSANS', 'SALINA', '''ARENA''', '''CAROLINA''', 'BIOMAGNIFICATION', 'BIOASSIMILATION', 'WOMBATS', 'POOS', 'ARSES', 'SOCIALIZATION', 'GROUPS,RACES', 'FULL-BLOOD', 'TASMANIAN', 'INCLINE', 'PICA', 'FIGGERED', '9.HE', 'RETINOBLASTOMA,IS', 'COAT''S', '2MOWRO', 'DOTHAN', 'DIFRNT', 'DEPICTS', 'WHAK', 'NETHERLAND', 'FORTHE', 'OFMICROSOFT', 'INDIGO', 'I-MAC', 'SHANGHAINESE', 'DINOSAOUR', 'SUBCAMPS', 'CARDINAL', 'NEBRASKA''S', 
'KOMO', 'PIRATE/SAILING', 'ZOPICLONE', 'CRYPTIC', 'CLUE"COULD', 'WHEELCHAIR,NO', 'OVERWIEGHT.DOES', 'ALL..MY', 'BOYFRIIEND', 'MAGTECH', 'TROUBLES...ANY', 'BACK,REDUCED', 'JEWELY', 'CRAFTSMEN''S', 'HAUNTER', 'GENGAR', 'CRYSTAL.WHERE', 'LAHORE', 'SANDLER''S', 'ACCENT...WHAT', 'WOOOOOAAAAAAAAH', 'W''T', 'RAJASEKARAN_NIKIL@YAHOO.COM', 'SPEICES', 'MCFLY', 'BIOCONVERSION', 'GUERRERO', 'CATHOLICS......WHAT', 'NOV.1--', 'ECONOLINE', 'AMOVIE', 'COUNSELLING', 'HANDSPAN', 'ATTIUTE', 'HAIR??HELP', 'PLASTIC?CERAMIC', 'TITLEIST', 'REISDENTS', '7''S', 'FERMATS', 'JBW&CWW', 'RB''S', 'KENYON', 'BAPTIST''S', 'THUMPERS', 'THOZ', 'HATZ', 'MSNISMS', 'POLL/SURVEY', 'INFUSION', 'FUNDRAISER', 'PROTECTS', 'ANTOINO', 'SYALLBUS', 'GCSES', 'SPIDER,SNAKE,DOG,CAT', 'KNOW.PLEASE', 'CHACH', 'DISSAPIONTED', 'TODGER', 'SH*TTING', 'LODESTONE', 'SARBANES', 'OXLEY', 'ANOTHR', 'RELATIONSHIP????HOW', 'T9', 'JIGGLY', 'GOOD?IS', 'HARDEN', 'DESERT?I', 'SIGNIFANT', 'WEDO', 'SCHAT', 'LQ', 'TENCHI', 'ME...WHAT''S', 'ERUPTED', '£40', '£150', 'UPGRATE', 'I500', '2003SE', 'PROMOT', 'SALUTES', 'GRAEME', 'SOUNESS', 'SERRONE', 'AHAVE', 'BUSTSA', 'IMPARTIAL', 'SUGARCULT', 'RFID', 'SWIPE', '30X', 'HOUSE''S', 'BI-WAY', 'BLYTHE,CA', 'REDDER', 'PLUMPER', 'LEFTWINGERS', 'WHINGE', 'ANNOYING..!!1', 'SENSORY', 'ADHD/ODD', 'ZYBAN', 'RAMP', 'SUB-WOOFER''S', 'TATTOOIST', '477', 'SOFISTCATED', 'B/J', 'ALLURE', 'THIS?(SEE', 'PAKISTANIS:DO', 'PLEASE?I', 'CLASHING', 'KNOW(DOCS', 'WHITESMOKE', 'SCREEN''S', 'SPINK', 'YOUI', 'DELICATE', 'MISDIAGNOSED', 'DIPERNO', 'HIM.PLEASE', '45CM', '35CM', 'ENOUGHT', 'DECIMETERS?PUT', 'RDMB', 'HOMELOANS', 'HAA', 'ORIGIANL', 'RESTON', 'ZINNIAS', 'PERENNIAL', 'WHOOPIE', 'CUSHIONS', 'POOFS', 'CAT,ITS', 'TIME,ITS', 'TUXEDOS', 'CHICKAN', 'WHISLE', 'RUMBLING', '7LEVELS', 'YOUFROM', 'CEMO', 'RECURRENT', 'LARYNGEAL', 'W/NO', 'SOUCRE', 'COMMA', 'WORDCUB', 'WOMEN:HAVE', 'PIGGYBACK', 'ANOLE', 'CHRISTIANSEN', 'SWEEPSTAKES,GETTING', 'WON.HOW', 'EUDORA5.1', 'EXPRESS,NETSCAPE4.X', '6.X', '''COURT', 'WISK''', 
'4/13', '50-70', 'DAUGHTER,CAN', 'UVULA', '''AYURVEDA''', 'ACUPUNTURE', 'BAJA', 'REASONIBLE', 'PUZZY', 'SORCERER', 'DISEASES/SICKNESS', 'ANUS/RECTUM?CAN', 'HOME?WHAT', 'COCETH', 'KELLOGGS', 'DAISUKE')

result:

2

Added

CREATE TABLE [dbo].[WordDictionary](
[Word] [nchar](30) NOT NULL,
[Count] [int] NOT NULL,
    CONSTRAINT [PK_WordDictionary] PRIMARY KEY CLUSTERED 
    (
      [Word] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,     
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    ) ON [PRIMARY]

Your IN list contains the item '²'.

I'll be very surprised if that isn't the source of the issue (though it doesn't actually match for me under my default collation)
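Why would '²' be a suspect when the unexpected result is 2? Unicode treats the superscript two as a compatibility variant of the digit 2, and that is the kind of equivalence a collation may choose to apply when comparing strings. The Python sketch below only demonstrates the Unicode relationship; it does not reproduce SQL Server's collation rules, which differ from one collation to another.

```python
import unicodedata

superscript_two = "\u00b2"  # the '²' item from the IN list

# A plain code-point comparison sees two different characters.
binary_equal = superscript_two == "2"

# NFKC normalization folds compatibility characters to their plain forms.
folded = unicodedata.normalize("NFKC", superscript_two)
folded_equal = folded == "2"

print(binary_equal, folded_equal)  # False True
```

If this is the cause, comparing under a binary collation would be one way to confirm it, since a binary comparison looks only at code points.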

Sql Server Performance And Order Of Fields

6 votes

Does the order in which fields are created in a table affect the performance of commands on that table? If the answer is yes, can anyone explain how?

For example, I have created a table like this:

create table Software(id int,alpha datetime,beta datetime,title nvarchar(100),stable datetime,description nvarchar(200) )

If I change it to

create table Software(id int,alpha datetime,beta datetime,stable datetime,description nvarchar(200),title nvarchar(100) )

is there any performance effect?

Is it clear?

The field order makes no difference whatsoever (provided the set of fields is the same, of course).

The on-disk structure will remain the same pretty much regardless. Simply:

  • header
  • fixed length columns
  • null bitmap
  • variable length columns

All you're doing above is rearranging some columns inside the "fixed length" and "variable length" sections. However, the same processing is required to retrieve them no matter which order they are in.
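As a rough illustration of that layout, here is a toy Python model that sorts the two column lists from the question into the four sections listed above. It is a deliberate simplification (the function record_layout and the FIXED set are invented for the example, and real rows carry more bookkeeping than this), but it shows that the declaration order only shuffles columns within a section.

```python
# Fixed-length types used in the question; nvarchar is variable-length.
FIXED = {"int", "datetime"}

def record_layout(columns):
    """Arrange columns as: header, fixed-length, null bitmap, variable-length."""
    fixed = [name for name, typ in columns if typ in FIXED]
    variable = [name for name, typ in columns if typ not in FIXED]
    return ["header", *fixed, "null bitmap", *variable]

first = [("id", "int"), ("alpha", "datetime"), ("beta", "datetime"),
         ("title", "nvarchar"), ("stable", "datetime"), ("description", "nvarchar")]
second = [("id", "int"), ("alpha", "datetime"), ("beta", "datetime"),
          ("stable", "datetime"), ("description", "nvarchar"), ("title", "nvarchar")]

layout_a = record_layout(first)
layout_b = record_layout(second)
print(layout_a)
print(layout_b)
```

In both cases the fixed-length section holds id, alpha, beta, stable, and the variable-length section holds title and description, just in a different order.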

See Paul Randal's article

Understanding COMPATIBILITY_LEVEL in SQL Server

5 votes

I understood that setting a database to a COMPATIBILITY_LEVEL prior to your native one prevented features from being used. However this doesn't seem to be the case. Witness the following SQL script:

CREATE DATABASE Foo
GO
USE Foo
GO
ALTER DATABASE Foo SET COMPATIBILITY_LEVEL = 80
GO

CREATE TABLE Bar
(
    Id UNIQUEIDENTIFIER NOT NULL,
    TestNvcMax NVARCHAR (MAX) NOT NULL, -- Arrived in SQL 2005
    TestDateTime2 DATETIME2 (7) NOT NULL -- Arrived in SQL 2008
)
GO

But this table creates perfectly - any ideas? I would have thought some kind of an error message or warning would have been appropriate

You can read about the differences between compatibility levels 80, 90 and 100 in the documentation for ALTER DATABASE Compatibility Level.

Apparently new data types are not affected. I think that compatibility level is there to make SQL Server "behave" like the older version, not prevent you from doing new fancy stuff.

How is a CLR table-valued function 'streaming'?

5 votes

The MSDN docs on table-valued SQL CLR functions state:

Transact-SQL table-valued functions materialize the results of calling the function into an intermediate table. ... In contrast, CLR table-valued functions represent a streaming alternative. There is no requirement that the entire set of results be materialized in a single table. The IEnumerable object returned by the managed function is directly called by the execution plan of the query that calls the table-valued function, and the results are consumed in an incremental manner. ... It is also a better alternative if you have very large numbers of rows returned, because they do not have to be materialized in memory as a whole.

Then I find out that no data access is allowed in the 'Fill row' method. This means that you still have to do all of your data access in the init method and keep it in memory, waiting for 'Fill row' to be called. Have I misunderstood something? If I don't force my results into an array or list, I get an error: 'ExecuteReader requires an open and available Connection. The connection's current state is closed.'

Code sample:

[<SqlFunction(DataAccess = DataAccessKind.Read, FillRowMethodName = "Example8Row")>]
static member InitExample8() : System.Collections.IEnumerable = 
   let c = cn() // opens a context connection
   // I'd like to avoid forcing enumeration here:
   let data = getData c |> Array.ofSeq
   data :> System.Collections.IEnumerable

static member Example8Row ((obj : Object),(ssn: SqlChars byref)) = 
   do ssn <- new SqlChars(new SqlString(obj :?> string))
   ()

I'm dealing with several million rows here. Is there any way to do this lazily?

I'm assuming you're using SQL Server 2008. As mentioned by a Microsoft employee on this page, 2008 requires methods to be marked with DataAccessKind.Read much more frequently than 2005. One of those times is when the TVF participates in a transaction (which seemed to always be the case, when I tested). The solution is to specify enlist=false in the connection string, which, alas, cannot be combined with context connection=true. That means your connection string needs to be in typical client format: Data Source=.;Initial Catalog=MyDb;Integrated Security=sspi;Enlist=false and your assembly must be created with permission_set=external_access, at minimum. The following works:

using System;
using System.Collections;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

namespace SqlClrTest {
    public static class Test {
        [SqlFunction(
            DataAccess = DataAccessKind.Read,
            SystemDataAccess = SystemDataAccessKind.Read,
            TableDefinition = "RowNumber int",
            FillRowMethodName = "FillRow"
            )]
        public static IEnumerable MyTest(SqlInt32 databaseID) {
            using (var con = new SqlConnection("data source=.;initial catalog=TEST;integrated security=sspi;enlist=false")) {
                con.Open();
                using (var cmd = new SqlCommand("select top (100) RowNumber from SSP1 where DatabaseID = @DatabaseID", con)) {
                    cmd.Parameters.AddWithValue("@DatabaseID", databaseID.IsNull ? (object)DBNull.Value : databaseID.Value);
                    using (var reader = cmd.ExecuteReader()) {
                        while (reader.Read())
                            yield return reader.GetInt32(0);
                    }
                }
            }
        }
        public static void FillRow(object obj, out SqlInt32 rowNumber) {
            rowNumber = (int)obj;
        }
    }
}

Here's the same thing in F#:

namespace SqlClrTest

module Test =

    open System
    open System.Data
    open System.Data.SqlClient
    open System.Data.SqlTypes
    open Microsoft.SqlServer.Server

    [<SqlFunction(
        DataAccess = DataAccessKind.Read,
        SystemDataAccess = SystemDataAccessKind.Read,
        TableDefinition = "RowNumber int",
        FillRowMethodName = "FillRow"
        )>]
    let MyTest (databaseID:SqlInt32) =
        seq {
            use con = new SqlConnection("data source=.;initial catalog=TEST;integrated security=sspi;enlist=false")
            con.Open()
            use cmd = new SqlCommand("select top (100) RowNumber from SSP1 where DatabaseID = @DatabaseID", con)
            cmd.Parameters.AddWithValue("@DatabaseID", if databaseID.IsNull then box DBNull.Value else box databaseID.Value) |> ignore
            use reader = cmd.ExecuteReader()
            while reader.Read() do
                yield reader.GetInt32(0)
        } :> System.Collections.IEnumerable

    let FillRow (obj:obj) (rowNumber:SqlInt32 byref) =
        rowNumber <- SqlInt32(unbox obj)

The good news is: Microsoft considers this a bug.
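To see what "streaming" buys you compared with forcing the results into an array, here is a language-neutral sketch in Python. A generator plays the role of the yield return loop over the data reader, and a counter records how many rows the source actually produced. All names here (stream_rows, materialized_rows, stats) are invented for the illustration, and no database is involved.

```python
from itertools import islice

stats = {"produced": 0}  # how many rows the source has generated so far

def stream_rows(total):
    """Yield rows one at a time, like `yield return reader.GetInt32(0)`."""
    for n in range(total):
        stats["produced"] += 1
        yield n

def materialized_rows(total):
    """Build the whole result first, like Array.ofSeq in the question."""
    return list(stream_rows(total))

# Streaming: the consumer stops after three rows, so only three are produced.
first_three = list(islice(stream_rows(1_000_000), 3))
streamed_count = stats["produced"]

# Materializing: every row is produced before the consumer sees the first one.
stats["produced"] = 0
first_three_again = materialized_rows(1_000)[:3]
materialized_count = stats["produced"]

print(streamed_count, materialized_count)  # 3 1000
```

With several million rows, that difference is the memory the question is trying to avoid holding.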

Techniques for Data Aging

5 votes

Hello, I'm looking for information about how to age data in a db, generally related to Oracle and SQL Server, but any database would be good. Any examples, or books containing examples of the best techniques, would be cool.

Bob

In Oracle, partitioning is very useful for implementing Information Lifecycle Management. It lets you manage data partition by partition, storing recent, frequently accessed data on quicker storage and older, usually less frequently accessed data on cheaper storage. If this is what you are trying to do, take a look at partitioning. 11g adds interval partitioning, which removes the need to preconfigure partitions (they are now created on an as-needed basis), and reference partitioning. Reference partitioning is also a performance booster because it is now easier to do partition-wise joins using PQ. It also saves space because the redundant key information is now in the partition definition.