Best sql-server questions in June 2011

Adding two .NET SqlDecimals increases precision?

12 votes

in .NET, when I add two SqlDecimals, like so:

SqlDecimal s1 = new SqlDecimal(1);
SqlDecimal s2 = new SqlDecimal(1);
SqlDecimal s3 = s1 + s2;

then s3 has precision 2, whereas both s1 and s2 have precision 1.

This seems odd, especially as the documentation states that the return value of the addition operator is "A new SqlDecimal structure whose Value property contains the sum." I.e. according to the documentation, addition should not change the precision.

Am I missing something here? Is this intended behaviour?

Cheers,

Tilman

This article (http://msdn.microsoft.com/en-us/library/ms190476.aspx) explains the behavior for the SQL types, and I assume the .NET Sql data types reflect that in their behavior.

Sql Server JDBC Connection Reset Error : Only on Amazon EC2

10 votes

Context: The Cloud

We have a java-based web application that we normally host on our own servers. Recently we used Amazon Web Services (AWS EC2) cloud to host an instance.

This "cloud setup" matches our typical "on site" setup: one server for the app server, another server for the database server. (Several app servers point to the same database server)

The problem In this cloud setup, we receive intermittent "connection reset by peer errors" between the database and the jdbc driver, where at (seemingly) random intervals and at random points in the codebase, the database connection fails.

Here are a few error excerpts for the log

Stack Trace Example 1:

at com.participate.pe.genericdisplay.client.taglib.GenDisplayViewTag.doStartTag(GenDisplayViewTag.java:77)
    ... 75 more
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
    at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:170)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:304)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.getMetaData(SQLServerConnection.java:1734)
    at org.jboss.resource.adapter.jdbc.WrappedConnection.getMetaData(WrappedConnection.java:354)

Stack Trace Example 2

    at java.lang.Thread.run(Thread.java:619)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Connection reset
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:1368)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:1355)
    at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1532)
    at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:3274)
    at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:4437)
    at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:4389)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection$1ConnectionCommand.doExecute(SQLServerConnection.java:1457)
    at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4026)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1416)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectionCommand(SQLServerConnection.java:1462)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.setAutoCommit(SQLServerConnection.java:1610)
    at org.jboss.resource.adapter.jdbc.BaseWrapperManagedConnection.checkTransaction(BaseWrapperManagedConnection.java:429)

Technical Environment

  • Jboss 4.2.2.GA (Jboss-Web 2.0/ Tomcat 6)
  • MSSQL 2005 2.0 jdbc driver

Some points

  • We have never seen this problem in our own environment (i.e. own data centers) running the application for several years
  • This led me to conclude "something funny is going on with Amazon network environment". I may be wrong/missing something/etc.
  • This problem only occurs with our application. We have other java and php applications which have not had this problem. The other java application uses a different jdbc driver (jtds, afaik)
  • It doesn't seem like a simple connection timeout

Questions

-Has anyone seen this before? -If it's an EC2 "known issue", can we configure our way around the problem (i.e. make sure everything is on its own subnet or virtual private cloud (vpc) ? -Any jdbc driver settings to get past this problem?

** Update ** I've extended and increased the bounty on this question.

On extra bit of information: the two virtual servers (database and application server) were on different subnets--i.e. one hop between the two servers.

In a non-cloud environment we have "zero hops" bewtewn the two servers.

Our hosting admins said we had no control over the subnets of our EC2 instances. This made me wonder if virtual private cloud would help.

thanks in advance

will

Just a word of caution on usind DBCP/connection pool features to mitigate the issue - the more you enable 'testOnBorrow' and other features, the more you can introduce latency or other performance changing affects on the system. I don't know if DBCP still does this or not, but a few years ago it would generate actual test queries to test the connection - full stack, database responses - not just at the network layer. The above link from Brian brings back horrific memories from the early 2000s on surrounding re-try logic for JDBC connection management.

Anyway, it's tough to really root cause this, other than gather evidence and eliminate the 'seemingly random' to a specific set of conditions:

  • You could try to throw up a Wireshark/PCAP trace, find when it happens, and send the results to both Amazon and Microsoft to see if they can root cause it

  • You could try the above with certain test harnesses to isolate the problem (JMeter tests to get concurrency up), bounce the network connection, watch for recovery, etc

  • You could try alternative versions of SQL Server to discount a SQL Server/JDBC driver bug that has since been fixed.

  • If DNS is used in connection strings, could use IP addresses to validate nslookup issues

I'm not a SQL Server expert, but another route for research could be within the related products domain - e.g. see if anyone experienced similar issues with TFS/Sharepoint (e.g. such as http://nickhoggard.wordpress.com/2009/12/07/further-experiences-with-tfs-2010-beta-2-on-amazon-ec2/ )

How to design Date-of-Birth in DB and ORM for mix of known and unkown dates

10 votes

Note up front, my question turns out to be similar to SO question 1668172.


This is a design question that surely must have popped up for others before, yet I couldn't find an answer that fits my situation. I want to record date-of-birth in my application, with several 'levels' of information:

  • NULL value, i.e. DoB is unkown
  • 1950-??-?? Only the DoB year value is known, date/month aren't
  • ????-11-23 Just a month, day, or combination of the two, but without a year
  • 1950-11-23 Full DoB is known

The technologies I'm using for my app are as follows:

  • Asp.NET 4 (C#), probably with MVC
  • Some ORM solution, probably just Linq-to-sql but NHibernate's an option too
  • MSSQL Server 2008, at first just Express edition

Possibilities for the SQL bit that crossed my mind so far:

  • 1) Use one nullable varchar column e.g. 1950-11-23, and replace unkowns with 'X's, e.g. XXXX-11-23 or 1950-XX-XX
  • 2) Use three nullable int columns e.g. 1950, 11, and 23
  • 3) Use an int column for year, plus a datetime column for full known DoBs

For the C# end of this problem I merely got to these two options:

  • A) Use a string property to represent DoB, convert only for view purposes.
  • B) Use a custom(?) struct for DoB with three nullable integers
  • C) Use a nullable datetime alongside a nullable integer for year

The solutions seem to form matched pairs at 1A, 2B or 3C. Of course 1A isn't a nice solution, but it does set a baseline.

Any tips and links are highly appreciated. Well, if they're related, anyhow :)

Edit, about the answers: I marked one answer as accepted, because I think it will work for me. It's worth looking at the other answers too though, if you've stumbled here with the same question.

The SQL Side

My latest idea on this subject is to use a range for dates that are uncertain or can have different specificity. Given two columns:

DobFromDate (inclusive)
DobToDate (exclusive)

Here's how it would work with your scenarios:

Specificity   DobFromDate  DobToDate
YMD           2006-05-05    2006-05-06
YM            2006-05-01    2006-06-01
Y             2006-01-01    2007-01-01
Unknown       0000-01-01    9999-12-31
MD, M, D      Not supported with this scheme

Note that there's no reason this couldn't be carried all the way to hour, minute, second, millisecond, and so on.

Then when querying for people born on a specific day:

DECLARE @BornOnDay date = '2006-05-16'

-- Include lower specificity:
SELECT *
FROM TheTable
WHERE
   DobFromDate <= @BornOnDay
   AND @BornOnDay < DobToDate;

-- Exclude lower specificity:
SELECT *
FROM TheTable
WHERE
   DobFromDate = @BornOnDay
   AND DobToDate = DateAdd(Day, 1, @BornOnDay);

This to me has the best mix of maintainability, ease of use, and expressive power. It won't handle loss of precision in the more significant values (e.g., you know the month and day but not the year) but if that can be worked around then I think it is a winner.

If you will ever be querying by date, then in general the better solutions (in my mind) are going to be those that preserve the items as dates on the server in some fashion.

Also, note that if you're looking for a date range rather than a single day, with my solution you still only need two conditions, not four:

DECLARE
   @FromBornOnDay date = '2006-05-16',
   @ToBornOnDay date = '2006-05-23';

-- Include lower specificity:
SELECT *
FROM TheTable
WHERE
   DobFromDate <= @ToBornOnDay
   AND @FromBornOnDay < DobToDate;

The C# Side

I would use a custom class with all the methods needed to do appropriate date math and date comparisons on it. You know the business requirements for how you will use dates that are unknown, and can encode the logic within the class. If you need something before a certain date, will you use only known or unknown items? What will ToString() return? These are things, in my mind, best solved with a class.

Large SQL transaction: runs out of memory on PostgreSQL, yet works on SQL Server

9 votes

I have decided to move my C# daemon application (using dotConnect as ADO.NET provider) from SQL Server 2008 R2 to PostgreSQL 9.0.4 x64 (on Windows Server 2008 R2). Therefore I slightly modified all queries to match PostgreSQL syntax and... got stuck on behavior which never happened with the same queries on SQL Server (not even on lowly Express edition).

Let's say the database contains 2 very simple tables without any relation to each other. They look somewhat like this: ID, Name, Model, ScanDate, Notes. I have a transformation process which reads data over TCP/IP, processes it, starts a transaction and puts the results into aforementioned 2 tables using vanilla INSERTs. The tables are initially empty; no BLOB columns. There are about 500.000 INSERTs on a bad day, all wrapped in a single transaction (and cannot be split into multiple transactions, btw). No SELECTs, UPDATEs or DELETEs are ever made. An example of INSERT (ID is bigserial - autoincremented automatically):

INSERT INTO logs."Incoming" ("Name", "Model", "ScanDate", "Notes")
VALUES('Ford', 'Focus', '2011-06-01 14:12:32', NULL)

SQL Server calmly accepts the load while maintaining a reasonable Working Set of ~200 MB. PostgreSQL, however, takes up additional 30 MB each second the transaction runs (!) and quickly exhausts system RAM.

I've done my RTFM and tried fiddling with postgresql.conf: setting "work_mem" to a minimum 64 kB (this slightly slowed down the RAM hogging), reducing "shared_buffers" / "temp_buffers" to minimum (no difference), - but to no avail. Reducing transaction isolation level to Read Uncommitted didn't help. There are no indexes except the one on ID BIGSERIAL (PK). SqlCommand.Prepare() makes no difference. No concurrent connections ever are established: daemon uses the database exclusively.

It may seem PostgreSQL cannot cope with mind-numbingly simple INSERT-fest, while SQL Server can do that. Maybe it's a PostgreSQL snapshot-vs-SQL Server locks isolation difference? It's a fact for me: vanilla SQL Server works, while neither vanilla nor tweaked PostgreSQL does.

What can I do to make PostgreSQL memory consumption to remain flat (as is apparently the case with SQL Server) while INSERT-based transaction runs?

EDIT: I have created an artificial testcase:

DDL:

CREATE TABLE sometable
(
  "ID" bigserial NOT NULL,
  "Name" character varying(255) NOT NULL,
  "Model" character varying(255) NOT NULL,
  "ScanDate" date NOT NULL,
  CONSTRAINT "PK" PRIMARY KEY ("ID")
)
WITH (
  OIDS=FALSE
);

C# (requires Devart.Data.dll & Devart.Data.PostgreSql.dll)

PgSqlConnection conn = new PgSqlConnection("Host=localhost; Port=5432; Database=testdb; UserId=postgres; Password=###########");
conn.Open();
PgSqlTransaction tx = conn.BeginTransaction(IsolationLevel.ReadCommitted);

for (int ii = 0; ii < 300000; ii++)
{
    PgSqlCommand cmd = conn.CreateCommand();
    cmd.Transaction = tx;
    cmd.CommandType = CommandType.Text;
    cmd.CommandText = "INSERT INTO public.\"sometable\" (\"Name\", \"Model\", \"ScanDate\") VALUES(@name, @model, @scanDate) RETURNING \"ID\"";
    PgSqlParameter parm = cmd.CreateParameter();
    parm.ParameterName = "@name";
    parm.Value = "SomeName";
    cmd.Parameters.Add(parm);

    parm = cmd.CreateParameter();
    parm.ParameterName = "@model";
    parm.Value = "SomeModel";
    cmd.Parameters.Add(parm);

    parm = cmd.CreateParameter();
    parm.ParameterName = "@scanDate";
    parm.PgSqlType = PgSqlType.Date;
    parm.Value = new DateTime(2011, 6, 1, 14, 12, 13);
    cmd.Parameters.Add(parm);

    cmd.Prepare();

    long newID = (long)cmd.ExecuteScalar();
}

tx.Commit();

This recreates the memory hogging. HOWEVER: if the 'cmd' variable is created and .Prepare()d outside the FOR loop, the memory does not increase! Apparently, preparing multiple PgSqlCommands with IDENTICAL SQL but different parameter values does not result in a single query plan inside PostgreSQL, like it does in SQL Server.

The problem remains: if one uses Fowler's Active Record dp to insert multiple new objects, prepared PgSqlCommand instance sharing is not elegant.

Is there a way/option to facilitate query plan reuse with multiple queries having identical structure yet different argument values?

UPDATE

I've decided to look at the simplest possible case - where a SQL batch is run directly on DBMS, without ADO.NET (suggested by Jordani). Surprisingly, PostgreSQL does not compare incoming SQL queries and does not reuse internal compiled plans - even when incoming query has the same identical arguments! For instance, the following batch:

PostgreSQL (via pgAdmin -> Execute query) -- hogs memory

BEGIN TRANSACTION;

INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES('somename', 'somemodel', '2011-06-01 14:12:19');
INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES('somename', 'somemodel', '2011-06-01 14:12:19');
-- the same INSERT is repeated 100.000 times

COMMIT;

SQL Server (via Management Studio -> Execute) -- keeps memory usage flat

BEGIN TRANSACTION;

INSERT INTO [dbo].sometable ("Name", "Model", "ScanDate") VALUES('somename', 'somemodel', '2011-06-01 14:12:19');
INSERT INTO [dbo].sometable ("Name", "Model", "ScanDate") VALUES('somename', 'somemodel', '2011-06-01 14:12:19');
-- the same INSERT is repeated 100.000 times

COMMIT;

and the PostgreSQL log file (thanks, Sayap!) contains:

2011-06-05 16:06:29 EEST LOG:  duration: 0.000 ms  statement: set client_encoding to 'UNICODE'
2011-06-05 16:06:43 EEST LOG:  duration: 15039.000 ms  statement: BEGIN TRANSACTION;

INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES('somename', 'somemodel', '2011-06-01 14:12:19');
INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES('somename', 'somemodel', '2011-06-01 14:12:19');
-- 99998 lines of the same as above
COMMIT;

Apparently, even after transmitting the whole query to the server as-is, the server cannot optimize it.

ADO.NET driver alternative

As Jordani suggested, I've tried NpgSql driver instead of dotConnect - with the same (lack of) results. However, Npgsql source for .Prepare() method contains such enlightening lines:

planName = m_Connector.NextPlanName();
String portalName = m_Connector.NextPortalName();
parse = new NpgsqlParse(planName, GetParseCommandText(), new Int32[] { });
m_Connector.Parse(parse);

The new content in the log file:

2011-06-05 15:25:26 EEST LOG:  duration: 0.000 ms  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
2011-06-05 15:25:26 EEST LOG:  duration: 1.000 ms  parse npgsqlplan1: INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES($1::varchar(255), $2::varchar(255), $3::date) RETURNING "ID"
2011-06-05 15:25:26 EEST LOG:  duration: 0.000 ms  bind npgsqlplan1: INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES($1::varchar(255), $2::varchar(255), $3::date) RETURNING "ID"
2011-06-05 15:25:26 EEST DETAIL:  parameters: $1 = 'SomeName', $2 = 'SomeModel', $3 = '2011-06-01'
2011-06-05 15:25:26 EEST LOG:  duration: 1.000 ms  execute npgsqlplan1: INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES($1::varchar(255), $2::varchar(255), $3::date) RETURNING "ID"
2011-06-05 15:25:26 EEST DETAIL:  parameters: $1 = 'SomeName', $2 = 'SomeModel', $3 = '2011-06-01'
2011-06-05 15:25:26 EEST LOG:  duration: 0.000 ms  parse npgsqlplan2: INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES($1::varchar(255), $2::varchar(255), $3::date) RETURNING "ID"
2011-06-05 15:25:26 EEST LOG:  duration: 0.000 ms  bind npgsqlplan2: INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES($1::varchar(255), $2::varchar(255), $3::date) RETURNING "ID"
2011-06-05 15:25:26 EEST DETAIL:  parameters: $1 = 'SomeName', $2 = 'SomeModel', $3 = '2011-06-01'
2011-06-05 15:25:26 EEST LOG:  duration: 0.000 ms  execute npgsqlplan2: INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES($1::varchar(255), $2::varchar(255), $3::date) RETURNING "ID"
2011-06-05 15:25:26 EEST DETAIL:  parameters: $1 = 'SomeName', $2 = 'SomeModel', $3 = '2011-06-01'
2011-06-05 15:25:26 EEST LOG:  duration: 0.000 ms  parse npgsqlplan3: INSERT INTO public."sometable" ("Name", "Model", "ScanDate") VALUES($1::varchar(255), $2::varchar(255), $3::date) RETURNING "ID"

Inefficiency is quite obvious in this log excerpt...

Conclusions (such as they are)

Frank's note about WAL is another awakening: something else to configure that SQL Server hides away from a typical MS developer.

NHibernate (even in its simplest usage) reuses prepared SqlCommands properly...if only it was used from the start...

it is obvious that an architectural difference exists between SQL Server and PostgreSQL, and the code specifically built for SQL Server (and thus blissfully unaware of the 'unable-to-reuse-identical-sql' possibility) will not work efficiently on PostgreSQL without major refactoring. And refactoring 130+ legacy ActiveRecord classes to reuse prepared SqlCommand objects in a messy multithreaded middleware is not a 'just-replace-dbo-with-public'-type affair.

Unfortunately for my overtime, Eevar's answer is correct :)

Thanks to everyone who pitched in!

I suspect you figured it out yourself. You're probably creating 500k different prepared statements, query plans and all. Actually, it's worse than that; prepared statements live outside of transaction boundaries and persist until the connection is closed. Abusing them like this will drain plenty of memory.

If you want to execute a query several times but avoid the planning overhead for each execution, create a single prepared statement and reuse that with new parameters.

If your queries are unique and ad-hoc, just use postgres' normal support for bind variables; no need for the extra overhead from prepared statements.

SQL Server aggregate performance

9 votes

I am wondering whether SQL Server knows to 'cache' if you like aggregates while in a query, if they are used again.

For example,

Select Sum(Field),
       Sum(Field) / 12
From   Table

Would SQL Server know that it has already calculated the Sum function on the first field and then just divide it by 12 for the second? Or would it run the Sum function again then divide it by 12?

Thanks

It calculates once

Select
   Sum(Price),
   Sum(Price) / 12
From
   MyTable

The plan gives:

|--Compute Scalar(DEFINE:([Expr1004]=[Expr1003]/(12.)))
  |--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1010]=(0) THEN NULL ELSE [Expr1011] END))
     |--Stream Aggregate(DEFINE:([Expr1010]=Count(*), [Expr1011]=SUM([myDB].[dbo].[MyTable].[Price])))
        |--Index Scan(OBJECT:([myDB].[dbo].[MyTable].[IX_SomeThing]))

This table has 1.35 million rows

SQL Select 'n' records without a Table

8 votes

Is there a way of selecting a specific number of rows without creating a table. e.g. if i use the following:

SELECT 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

It will give me 10 across, I want 10 New Rows.

Thanks

You can use a recursive CTE to generate an arbitrary sequence of numbers in T-SQL like so:

DECLARE @start INT = 1;
DECLARE @end INT = 10;

WITH numbers AS (
    SELECT @start AS number
    UNION ALL
    SELECT number + 1 
    FROM  numbers
    WHERE number < @end
)
SELECT *
FROM numbers
OPTION (MAXRECURSION 0);

SQL Parser in Delphi for SQL Server

7 votes

I am looking for a good SQL parser in Delphi (2010) for SQL Server. I need such thing to parse an query and extract: select list, where and order by clauses. It does not matter if it is a commercial or an open-source solution. It can also be a DLL (written in any language, of course) but I prefer a VCL component over a DLL.

Rafael Delphi comes with a sql parser located in the DBCommon unit

Check these functions

function NextSQLToken(var p: PAnsiChar; out Token: AnsiString; CurSection: TSQLToken): TSQLToken; overload;
function NextSQLToken(var p: PWideChar; out Token: WideString; CurSection: TSQLToken): TSQLToken; overload;
function NextSQLToken(var p: PChar; out Token: String; CurSection: TSQLToken): TSQLToken; overload;

function GetIndexForOrderBy(const SQL: WideString; DataSet: TDataSet): TIndexDef;
function GetTableNameFromSQL(const SQL: WideString): WideString;
function GetTableNameFromQuery(const SQL: Widestring): Widestring;
function AddParamSQLForDetail(Params: TParams; SQL: WideString; Native: Boolean; QuoteChar: WideString = ''): WideString;
function IsMultiTableQuery(const SQL: WideString): Boolean;
function SQLRequiresParams(const SQL: WideString): Boolean;

function NextSQLTokenEx(var p: PWideChar; out Token: UnicodeString; CurSection: TSQLToken; IdOption: IDENTIFIEROption): TSQLToken; overload;
function NextSQLTokenEx(var p: PWideChar; out Token: WideString; CurSection: TSQLToken; IdOption: IDENTIFIEROption): TSQLToken; overload;
function NextSQLTokenEx(var p: PAnsiChar; out Token: AnsiString; CurSection: TSQLToken; IdOption: IDENTIFIEROption): TSQLToken; overload;
function GetTableNameFromSQLEx(const SQL: WideString; IdOption: IDENTIFIEROption): WideString;

This is a very simple sample to show how parse a sql sentence and get all the elements.

uses
  TypInfo,
  DbCommon,
  SysUtils;


const
  StrSql ='Select Field1, Field2, 54 field3, Field4 from Mytable1 Order by Field1,Field5';

procedure ParseSql(Const Sql : string);
var
  SQLToken     : TSQLToken;
  CurSection   : TSQLToken;
  Start        : PWideChar;
  Token        : WideString;
  IdOption     : IDENTIFIEROption;
begin
  IdOption   :=idMixCase;
  Start      :=PWideChar(StrSql);
  CurSection := stUnknown;
  repeat
    SQLToken := NextSQLTokenEx(Start, Token, CurSection, IdOption);
    if SQLToken<>stEnd then
    Writeln(Format('Type %s Token %s', [GetEnumName(TypeInfo(TSQLToken), integer(SQLToken)),Token]));
    CurSection := SQLToken;
  until SQLToken in [stEnd];
end;



begin
  try
    ParseSql(StrSql);
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  Readln;
end.

this will return

Type stSelect Token Select
Type stFieldName Token Field1
Type stFieldName Token Field2
Type stNumber Token 54
Type stFieldName Token field3
Type stFieldName Token Field4
Type stFrom Token from
Type stTableName Token Mytable1
Type stOrderBy Token Order by
Type stFieldName Token Field1
Type stFieldName Token Field5

Can R produce on-the-fly graphs for website?

7 votes

I use a Flex/ColdFusion/MSSQl combo to take input from users to generate charts for a website . Is this possible in R? I have used RODBC and sqlQuery as a way of producing static graphs but cannot seem to find a way of doing it dynamically. Over to you JU

Of course you can, you can use fantastic Apache module that allows stateless execution of R scripts - RApache. You can define an R script and catch (unserialize) plot parameters (e.g. via JSON or URL encoded string), plot the graph, and load result(s) with AJAX. That's pretty much what I did in my app.

If you're not satisfied with R graph capabilities (and I'm sure that's so not gonna happen), you can try out googleVis or canvas packages. The first one is "only" a library for GoogleVis API for R, and I'm sure you'll like the later if you're familiar with HTML5 canvas. Some lads found it useful.

So, the final answer is, yes, you can!. You only need to decide whether you're going to generate graphs on client or server-side. Of course, even if you decide to generate graphs on the client side, you must massage your data in R and return it in serialized form (JSON or XML encoded). I know that ExtJS 4 also has good interface for creating client-side graphs, but I haven't used it much (read: "at all").

SQL Server 2008 Query Editor changes the query logic

7 votes

I hand coded a simple SQL in SQL Server 2008 as below;

SELECT * FROM Tab1 WHERE A='1' AND (B='1' OR C='1');

Being lazy I opened this query in the Query Editor to validate the syntax and pressed OK on the dialog without making any changes.

I noticed that the Query Editor had changed my query to:

SELECT * FROM Tab1 WHERE A='1' AND (B='1') OR (C='1');

clearly this changes the logic of the SQL and returns different results depending on which one you execute.

I routinely use the Query Editor to validate my syntax on complex queries. So a little worried that the a subtle change like this would go unotice, but would change the outcome.

Is this a feature of the designer? Is there something I can do to change this behavior?

EDIT: Thanks for pointing out that the changes made by the editor is not quite the same as above, but still the query is modified although the results are the same.

Thanks

I tried to replicate this in the Query Designer and had a slightly different result. I typed the same as you:

SELECT * FROM Tab1 WHERE A='1' AND (B='1' OR C='1');

And got this:

SELECT     *
FROM         Tab1
WHERE     (A = '1') AND (B = '1') OR
                      (A = '1') AND (C = '1')

I have to say that the result is the same, but we can all see a dangerous road here. Also, I did not like the (A = '1') replication. Heck, I want the code how I coded it!

A word to the wise: I never format my queries in SQL Server Management Studio. Have you seen what it does to your view's code? I hate it. I just code somewhere else and paste in SMS when done.

What's the best way send email from the database in SQL Server 2008 R2?

6 votes

Migrating a legacy database from SQL Server 2000, and there are a few scheduled tasks that use SP_OACreate & CDOSYS to send emails to an open SMTP relay as in this KB article: http://support.microsoft.com/kb/312839

In SQL Server 2008 R2, what's the best (simplest) way to replace the sendmail functionality? It seems like DB Mail is the new best recommended way, anyone have much experience with it or another alternative?

Yes, DB Mail is by far the best way: it's simple SMTP.

Don't think about using:

  • sp_OA% and cdosys (sp_OA% can be tricky and leak resources)
  • xp_sendmail (require local Outlook install for MAPI client)

Update with sub select - How to handle NULL values?

6 votes

I'm trying an update with a conditional sub-select which could return null...

UPDATE 
aTable SET 
aColumn = 
(   
    SELECT TOP 1    
        CASE 
            WHEN bTable.someColumn = 1 THEN someValue1 
            WHEN bTable.someColumn = 2 THEN someValue2 
            ELSE someValue3
        END  
    FROM         
        bTable
    WHERE
        bTable = @someCriteria
    ORDER BY
        someSortColumn
) WHERE 
aTable.id = @someId;

If the "bTable = @someCriteria" clause causes no results to be returned from the SELECT, it attempts to insert a NULL into "aColumn", which in this case is a NOT NULL column.

Question

How do I get it to simply leave "aColumn" alone in this circumstance?

Many thanks.

...
aColumn = 

    ISNULL(
        (   
            SELECT TOP 1    
                CASE 
                    WHEN bTable.someColumn = 1 THEN someValue1 
                    WHEN bTable.someColumn = 2 THEN someValue2 
                    ELSE someValue3
                END  
            FROM         
                bTable
            WHERE
                bTable = @someCriteria
            ORDER BY
                someSortColumn
        ), aColumn)
...

A Never Delete Relational DB Schema Design

6 votes

I am considering designing a relational DB schema for a DB that never actually deletes anything (sets a deleted flag or something).

1) What metadata columns are typically used to accomodate such an architecture? Obviously a boolean flag for IsDeleted can be set. Or maybe just a timestamp in a Deleted column works better, or possibly both. I'm not sure which method will cause me more problems in the long run.

2) How are updates typically handled in such architectures? If you mark the old value as deleted and insert a new one, you will run into PK unique constraint issues (e.g. if you have PK column id, then the new row must have the same id as the one you just marked as invalid, or else all of your foreign keys in other tables for that id will be rendered useless).

Here are some additional questions that you'll also want to consider

  1. How often do deletes occur. What's your performance budget like? This can affect your choices. The answer to your design will be different depending of if a user deleting a single row (like lets say an answer on a Q&A site vs deleting records on an hourly basis from a feed)

  2. How are you going to expose the deleted records in your system. Is it only through administrative purposes or can any user see deleted records. This makes a difference because you'll probably need to come up with a filtering mechanism depending on the user.

  3. How will foreign key constraints work. Can one table reference another table where there's a deleted record?

  4. When you add or alter existing tables what happens to the deleted records?

Typically the systems that care a lot about audit use tables as Steve Prentice mentioned. It often has every field from the original table with all the constraints turned off. It often will have a action field to track updates vs deletes, and include a date/timestamp of the change along with the user.

For an example see the PostHistory Table at http://data.stackexchange.com/stackoverflow/query/new

Compatible SQL to test for not null and not empty strings

6 votes

I want to have compatible SQL for both Oracle database and Microsoft SQL server.

I want a compatible SQL expression that will return true for not null and not empty strings.

If I use:

column <> ''

it will work on Microsoft SQL server but not on Oracle database (as '' is null for Oracle)

If I use:

len(column) > 0

it will work on Microsoft SQL server but not on Oracle database (since it uses length() )

NULLIF is available on both Oracle (doc) and SQL Server (doc). This expression should work:

NULLIF(column, '') IS NOT NULL

In both servers, if column is NULL, then the output of NULLIF will just pass the NULL value through. On SQL Server, '' = '', so the output of NULLIF will be NULL. On Oracle, '' is already NULL, so it gets passed through.

This is my test on SQL Server 2008 R2 Express:

WITH SampleData AS
    (SELECT 1 AS col1, CAST(NULL AS varchar(10)) AS col2
     UNION ALL
     SELECT 2, ''
     UNION ALL
     SELECT 3, 'hello')
SELECT *
  FROM SampleData
 WHERE NULLIF(col2, '') IS NOT NULL;

And this is my test case on Oracle 10g XE:

WITH SampleData AS
    (SELECT 1 AS col1, NULL AS col2 FROM DUAL
     UNION ALL
     SELECT 2, '' FROM DUAL
     UNION ALL
     SELECT 3, 'hello' FROM DUAL)
SELECT *
  FROM SampleData
 WHERE NULLIF(col2, '') IS NOT NULL;

Both return 3 as expected.