Code Buckets

Buckets of code

Data and Data Access

T-SQL Random Data Generator

I’ve often had the need to populate database tables with chunks of random data for testing and so forth and this random data generator stored procedure does the trick for me. I’ve had it knocking round in my sock draw for quite a while so I’ve dusted it off and updated it. It actually does a bit more than I remembered. I’ve used it on SQL Server 2005 and 2008 and the INFORMATION_SCHEMA use should make it forwardly compatible.

Code

usp_RandomInsertGenerator

This procedure actually does the heavy lifting and generates the insert statements.

Create Procedure usp_RandomInsertGenerator
(
 @TableName nvarchar(128)
)
As

-- does foreign keys
-- max character length
-- standard data types
-- identity columns
-- detects nullable columns and leaves as null

-- does not do
-- omitted datatypes - binary, varbinary, image, timestamp

-- Hacks
-- foreign key links where linked to child table via a unique index rather than a primary key. Assumpts FK of 1

-- assumes 
-- foreign keys are of a numeric type i.e. ints, tinyints, floats etc...
-- assumes foreign key constraints exist

Begin

Set NoCount On

Declare @ColumnName nvarchar(128)
 Declare @FK_TableName nvarchar(128)
 Declare @FK_ColumnName nvarchar(128)
 Declare @ConstraintName nvarchar(128)
 Declare @DataType nvarchar(128)
 Declare @CharacterMaximumLength int
 Declare @Sql nvarchar(max)
 Declare @MaxValue int
 Declare @InsertValue nvarchar(400)
 Declare @SqlOutputFields nvarchar(max)
 Declare @SqlOutputValues nvarchar(max)
 Declare @FirstLoop bit
 Declare @IsIdentity bit
 Declare @StringInsert bit
 Declare @DummyText varchar(48)

Set @FirstLoop = 1
 Set @SqlOutputFields = 'INSERT INTO ' + @TableName + ' ('
 Set @SqlOutputValues = ' VALUES ('
 Set @DummyText = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

-- cursor loops through every column for named table
 Declare procCursor CURSOR FORWARD_ONLY
 For 
 Select 
 COLUMN_NAME, 
 DATA_TYPE, 
 CHARACTER_MAXIMUM_LENGTH 
 From 
 INFORMATION_SCHEMA.COLUMNS col
 Where 
 TABLE_NAME = @TableName
 And IS_NULLABLE = 'NO'
 Order By 
 ORDINAL_POSITION

Open procCursor
 Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength

while @@fetch_status <> -1
 Begin
 
 -- datatypes i have not bothered implementing
 if @DataType = 'cursor' OR @DataType = 'timestamp' 
 Or @DataType = 'binary' Or @DataType = 'varbinary' Or @DataType = 'image'
 Begin
 Raiserror('Unsupported Data Type', 1, 16)
 End

--reset variables
 Set @FK_TableName = ''
 Set @FK_ColumnName = ''
 Set @StringInsert = 0
 Set @ConstraintName = ''
 
 -- Do not add in an insert value if the loop is an identity
 Select @IsIdentity = COLUMNPROPERTY(OBJECT_ID(@TableName),@ColumnName,'IsIdentity')
 if @IsIdentity = 1 
 Begin
 Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength
 continue
 End
 
 -- getting the value to be inserted for this data type
 if @DataType = 'varchar' Or @DataType = 'nvarchar' 
 Or @DataType = 'char' Or @DataType = 'nchar'
 Or @DataType = 'text' Or @DataType = 'ntext'
 Begin
 Set @StringInsert = 1
 Set @InsertValue = SubString(@DummyText, 1, @CharacterMaximumLength)
 End
 Else if @DataType = 'datetime' Or @DataType = 'smalldatetime'
 Begin
 Set @StringInsert = 1
 Set @InsertValue = Cast(GetDate() as varchar(20))
 End
 Else if @DataType = 'uniqueidentifier'
 Begin
 
 Set @StringInsert = 1
 Set @InsertValue = Cast(NewId() as varchar(200))
 End
 Else -- it is some key of numeric type
 Begin
 
 -- getting the child table indexes
 Set @Sql = '
 Select 
 @FK_TableName = pkconst.TABLE_NAME,
 @FK_ColumnName = pkconst.COLUMN_NAME,
 @ConstraintName = coluse.CONSTRAINT_NAME
 From 
 INFORMATION_SCHEMA.KEY_COLUMN_USAGE coluse 
 Join INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS fkconst 
 On fkconst.CONSTRAINT_NAME = coluse.CONSTRAINT_NAME
 Left Join INFORMATION_SCHEMA.KEY_COLUMN_USAGE pkconst 
 On pkconst.CONSTRAINT_NAME = fkconst.UNIQUE_CONSTRAINT_NAME
 Where 
 coluse.TABLE_NAME = @TableName 
 And coluse.COLUMN_NAME = @ColumnName'


 Execute sp_executesql @Sql, N'@TableName nvarchar(128), 
 @ColumnName nvarchar(128), 
 @FK_TableName nvarchar(128) OUTPUT, 
 @FK_ColumnName nvarchar(128) OUTPUT,
 @ConstraintName nvarchar(128) OUTPUT', 
 @TableName=@TableName, 
 @ColumnName=@ColumnName, 
 @FK_TableName=@FK_TableName OUTPUT, 
 @FK_ColumnName=@FK_ColumnName OUTPUT,
 @ConstraintName=@ConstraintName OUTPUT
 
 if Len(@FK_TableName) > 0 And Len(@FK_ColumnName) > 0
 Begin

/* have found foreign key and the lookup table 
 so pick a random primary key from the lookup table */
 Set @Sql = 'Select top 1 @InsertValue = Cast(' + @FK_ColumnName + 
 ' as varchar(200)) From ' 
 + @FK_TableName + ' Order By newid()'
 Execute sp_executesql @Sql, N'@InsertValue nvarchar(128) OUTPUT', @InsertValue=@InsertValue OUTPUT

End
 Else if(Len(@ConstraintName) > 0)
 Begin
 
 /* OK we've found the foreign key constraint but have no idea what the
 lookup table is. This is because we are joining on a unique index to the 
 lookup table not a primary key. Make a MASSIVE assumption in this instance
 - that the lookup table has a link value of 1 */
 Set @InsertValue = '1'

End
 Else
 Begin
 -- no foreign key so the max that can be inserted is based on the datatype
 -- do not bother with any thing greater than thirty thousand - big enough
 if @DataType = 'bit'
 Set @MaxValue = 1
 else if @DataType = 'tinyint'
 Set @MaxValue = 255
 else if @DataType = 'smallint'
 Set @MaxValue = 32767
 else
 Set @MaxValue = 32767
 
 -- randomly generate a number to insert up to maximum
 Set @InsertValue = Cast(ROUND(((@MaxValue - 1) * RAND() + 1), 0) as varchar(200)) 
 
 End
 End -- end of numeric processing
 
 -- building up output string
 Declare @Delimiter char(1)
 if @FirstLoop = 1
 Begin
 Set @FirstLoop = 0
 Set @Delimiter = ''
 End
 Else
 Begin
 Set @Delimiter = ','
 End

Set @SqlOutputFields = @SqlOutputFields + @Delimiter + @ColumnName
 
 if @StringInsert = 1
 Begin
 Set @SqlOutputValues = @SqlOutputValues + @Delimiter + '''' + @InsertValue + ''''
 End
 Else
 Begin
 Set @SqlOutputValues = @SqlOutputValues + @Delimiter + @InsertValue 
 End

Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength
 End -- finished this column = go to next

close procCursor
 deallocate procCursor

-- outputting the sql string
 Set @SqlOutputFields = @SqlOutputFields + ')'
 Set @SqlOutputValues = @SqlOutputValues + ')'

select @SqlOutputFields + ' ' + @SqlOutputValues

End 
Go

usp_RandomInsertGeneratorWrapper

It’s very easy to create a wrapper stored procedure to get multiple statements. One row of random data isn’t that much use – 1000 might be.

Create Procedure usp_RandomInsertGeneratorWrapper
(
 @TableName nvarchar(128),
 @NumberOfInserts int
)
As
Begin
 
 Set NoCount On

Declare @StatementCount int
 Set @StatementCount = 0

While @StatementCount < @NumberOfInserts
 Begin
 
 exec usp_RandomInsertGenerator @TableName
 Set @StatementCount = @StatementCount + 1
 End

End

Usage

To get an insert for one line of data

exec usp_RandomInsertGenerator ‘MyTableName’

To get multiple insert statements call the wrapper function

usp_RandomInsertGeneratorWrapper ‘MyTableName’, 1000

Notes and disclaimers

What it does

  • It outputs an insert T-SQL statement for any named table in the database.
  • Detects foreign keys and inserts and makes a random but valid link to the child table (this was the hard part)
  • Detect different data types and does the necessary SQL formatting
  • Accounts for max character length in string type fields
  • Accounts for identity columns
  • Detects nullable columns and leaves them null

What it doesn’t do

  • Doesn’t account for unique indexes
  • There are omitted datatypes (cursor, timestamp, binary, varbinary, image) that I’m not really interested in right now.

Hacks and shortcuts

Where the foreign key links are linked to a child table via a unique index rather than a primary key it assumes a foreign key of 1. The INFORMATION_SCHEMA views don’t include index information. So this would necessitate delving into the sys.index table. Not impossible but the proc was complicated enough as it was. I didn’t need it so I didn’t do it.

Assumptions

  1. Foreign keys are of a numeric type i.e. int, float et. al.
  2. Foreign key constraints exist on the database. It isn’t magic – it can’t work out the relationships if they don’t exist.

Room for improvement

Here’s some extensions that may be useful for individual projects

  1. Implement missing datatypes.
  2. Create custom inserts for known field names to give more realistic data for your application
  3. Randomise the character data properly.

So that’s it. Live it, love it and fiddle about with it a bit.

Useful links

The official documentation for the INFORMATION_SCHEMA used in this procedure is here
https://msdn.microsoft.com/en-us/library/ms186778.aspx

As ever the source code is on GitHub. There is nothing more than what is here but my WordPress code formatting plugin hasn’t made a great job of formatting the SQL code so the GitHub link might give clearer view of the SQL. Also, it seemed cruel not to give it a home on my demo repository with the rest of my code samples.
https://github.com/timbrownls20/Demo/blob/master/T-SQL/usp_RandomInsertGenerator.sql

 

 

LEAVE A RESPONSE

Your email address will not be published. Required fields are marked *