Selecting from a column with possible NULLs in T-SQL

I write blog posts for different reasons. Sometimes it’s because things aren’t well documented elsewhere, sometimes it’s new and interesting and sometimes it’s purely for my own amusement. Occasionally I write things as I kind of penance – things I should know about but bizarrely forgot or misunderstood. This is one of those. It’s how to select from a column with possible NULLs and not omit expected rows from the results set. This has been documented at length elsewhere but that’s not the point of penance.

Test Cases – equals operator

Working through some examples when comparing NULLs with the equals operator.

NULL is equal to ‘test’

False: as expected

NULL is not equal ‘test’

False:  less expected

NULL is not equal to NULL

False: more expected

NULL is equal to NULL

False: odd again

NULL isn’t equal to anything including itself when using the equals operator. I know this really and we see the same kind of thing in JavaScript.

Test cases – IS NULL

Of course it well known that equals doesn’t do the job with NULLS. The IS NULL operator is correct

Null is null

True.

Null is not null

False.

All as expected.

Test case -selecting from a column with possible NULLs

So this is what tripped me up – when you have a column that is nullable and trying to filter as a string.

I naively thought that there are 2 rows that aren’t ‘Test’ so we will get those. But since NULL is not equal to ‘Test’ and not not equal to ‘Test’ then actually we get one less row than we expected

The correct select is

Which converts the NULL to an empty string so then works

Also works and is the same. Penitential blog post over.

Useful Links

https://stackoverflow.com/questions/18828641/sql-difference-between-coalesce-and-isnull
Comparison of ISNULL and COALESCE

http://adripofjavascript.com/blog/drips/equals-equals-null-in-javascript.html
It’s not just T-SQL. Special handling around NULLs in JavaScript

SQL Server restore fails if backups taken in rapid succession

How I (unfairly) think of SQL Server

I don’t spend an awful lot of time thinking about SQL Server. To me it’s a workhorse which I think is a compliment. * It’s steady, reliable and gets the job done. It’s not glamorous though and people don’t talk about it at dinner parties.

I did have cause to give the old SQL Server workhorse a bit more thought recently. I’m having to do a bit of database work and I’m taking lots of backups in rapid succession and restoring them. For some reason the backups didn’t restore the most recent version of the database. The restored DB was weirdly out of date. I got round this by going to options on the backup dialog and selecting ‘Overwrite existing backup sets’

Once I’d blatted out the existing sets then it worked fine. I don’t know why it wouldn’t work if I appended to backup sets and clearly this is a development environment workaround. I’m not advocating people merrily wiping out backup sets on production sets. A better workaround might have been to start using snapshots but once the workhorse was lashed back to the plough then I was happy. That said, I’d be very interested to know why it was doing that if anyone knows.

* On reflection, I wouldn’t like it if called me a workhorse so perhaps it’s not as much of a compliment as I imagine.

T-SQL Random Data Generator

I’ve often had the need to populate database tables with chunks of random data for testing and so forth and this random data generator stored procedure does the trick for me. I’ve had it knocking round in my sock draw for quite a while so I’ve dusted it off and updated it. It actually does a bit more than I remembered. I’ve used it on SQL Server 2005 and 2008 and the INFORMATION_SCHEMA use should make it forwardly compatible.

Code

usp_RandomInsertGenerator

This procedure actually does the heavy lifting and generates the insert statements.

usp_RandomInsertGeneratorWrapper

It’s very easy to create a wrapper stored procedure to get multiple statements. One row of random data isn’t that much use – 1000 might be.

Usage

To get an insert for one line of data

To get multiple insert statements call the wrapper function

Notes and disclaimers

What it does

  • It outputs an insert T-SQL statement for any named table in the database.
  • Detects foreign keys and inserts and makes a random but valid link to the child table (this was the hard part)
  • Detect different data types and does the necessary SQL formatting
  • Accounts for max character length in string type fields
  • Accounts for identity columns
  • Detects nullable columns and leaves them null

What it doesn’t do

  • Doesn’t account for unique indexes
  • There are omitted datatypes (cursor, timestamp, binary, varbinary, image) that I’m not really interested in right now.

Hacks and shortcuts

Where the foreign key links are linked to a child table via a unique index rather than a primary key it assumes a foreign key of 1. The INFORMATION_SCHEMA views don’t include index information. So this would necessitate delving into the sys.index table. Not impossible but the proc was complicated enough as it was. I didn’t need it so I didn’t do it.

Assumptions

  1. Foreign keys are of a numeric type i.e. int, float et. al.
  2. Foreign key constraints exist on the database. It isn’t magic – it can’t work out the relationships if they don’t exist.

Room for improvement

Here’s some extensions that may be useful for individual projects

  1. Implement missing datatypes.
  2. Create custom inserts for known field names to give more realistic data for your application
  3. Randomise the character data properly.

So that’s it. Live it, love it and fiddle about with it a bit.

Useful links

The official documentation for the INFORMATION_SCHEMA used in this procedure is here
https://msdn.microsoft.com/en-us/library/ms186778.aspx

As ever the source code is on GitHub. There is nothing more than what is here but my WordPress code formatting plugin hasn’t made a great job of formatting the SQL code so the GitHub link might give clearer view of the SQL. Also, it seemed cruel not to give it a home on my demo repository with the rest of my code samples.
https://github.com/timbrownls20/Demo/blob/master/T-SQL/usp_RandomInsertGenerator.sql