I’ve often had the need to populate database tables with chunks of random data for testing and so forth and this random data generator stored procedure does the trick for me. I’ve had it knocking round in my sock draw for quite a while so I’ve dusted it off and updated it. It actually does a bit more than I remembered. I’ve used it on SQL Server 2005 and 2008 and the INFORMATION_SCHEMA use should make it forwardly compatible.
Code
usp_RandomInsertGenerator
This procedure actually does the heavy lifting and generates the insert statements.
Create Procedure usp_RandomInsertGenerator ( @TableName nvarchar(128) ) As -- does foreign keys -- max character length -- standard data types -- identity columns -- detects nullable columns and leaves as null -- does not do -- omitted datatypes - binary, varbinary, image, timestamp -- Hacks -- foreign key links where linked to child table via a unique index rather than a primary key. Assumpts FK of 1 -- assumes -- foreign keys are of a numeric type i.e. ints, tinyints, floats etc... -- assumes foreign key constraints exist Begin Set NoCount On Declare @ColumnName nvarchar(128) Declare @FK_TableName nvarchar(128) Declare @FK_ColumnName nvarchar(128) Declare @ConstraintName nvarchar(128) Declare @DataType nvarchar(128) Declare @CharacterMaximumLength int Declare @Sql nvarchar(max) Declare @MaxValue int Declare @InsertValue nvarchar(400) Declare @SqlOutputFields nvarchar(max) Declare @SqlOutputValues nvarchar(max) Declare @FirstLoop bit Declare @IsIdentity bit Declare @StringInsert bit Declare @DummyText varchar(48) Set @FirstLoop = 1 Set @SqlOutputFields = 'INSERT INTO ' + @TableName + ' (' Set @SqlOutputValues = ' VALUES (' Set @DummyText = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' -- cursor loops through every column for named table Declare procCursor CURSOR FORWARD_ONLY For Select COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH From INFORMATION_SCHEMA.COLUMNS col Where TABLE_NAME = @TableName And IS_NULLABLE = 'NO' Order By ORDINAL_POSITION Open procCursor Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength while @@fetch_status <> -1 Begin -- datatypes i have not bothered implementing if @DataType = 'cursor' OR @DataType = 'timestamp' Or @DataType = 'binary' Or @DataType = 'varbinary' Or @DataType = 'image' Begin Raiserror('Unsupported Data Type', 1, 16) End --reset variables Set @FK_TableName = '' Set @FK_ColumnName = '' Set @StringInsert = 0 Set @ConstraintName = '' -- Do not add in an insert value if the loop is an identity Select @IsIdentity = COLUMNPROPERTY(OBJECT_ID(@TableName),@ColumnName,'IsIdentity') if @IsIdentity = 1 Begin Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength continue End -- getting the value to be inserted for this data type if @DataType = 'varchar' Or @DataType = 'nvarchar' Or @DataType = 'char' Or @DataType = 'nchar' Or @DataType = 'text' Or @DataType = 'ntext' Begin Set @StringInsert = 1 Set @InsertValue = SubString(@DummyText, 1, @CharacterMaximumLength) End Else if @DataType = 'datetime' Or @DataType = 'smalldatetime' Begin Set @StringInsert = 1 Set @InsertValue = Cast(GetDate() as varchar(20)) End Else if @DataType = 'uniqueidentifier' Begin Set @StringInsert = 1 Set @InsertValue = Cast(NewId() as varchar(200)) End Else -- it is some key of numeric type Begin -- getting the child table indexes Set @Sql = ' Select @FK_TableName = pkconst.TABLE_NAME, @FK_ColumnName = pkconst.COLUMN_NAME, @ConstraintName = coluse.CONSTRAINT_NAME From INFORMATION_SCHEMA.KEY_COLUMN_USAGE coluse Join INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS fkconst On fkconst.CONSTRAINT_NAME = coluse.CONSTRAINT_NAME Left Join INFORMATION_SCHEMA.KEY_COLUMN_USAGE pkconst On pkconst.CONSTRAINT_NAME = fkconst.UNIQUE_CONSTRAINT_NAME Where coluse.TABLE_NAME = @TableName And coluse.COLUMN_NAME = @ColumnName' Execute sp_executesql @Sql, N'@TableName nvarchar(128), @ColumnName nvarchar(128), @FK_TableName nvarchar(128) OUTPUT, @FK_ColumnName nvarchar(128) OUTPUT, @ConstraintName nvarchar(128) OUTPUT', @TableName=@TableName, @ColumnName=@ColumnName, @FK_TableName=@FK_TableName OUTPUT, @FK_ColumnName=@FK_ColumnName OUTPUT, @ConstraintName=@ConstraintName OUTPUT if Len(@FK_TableName) > 0 And Len(@FK_ColumnName) > 0 Begin /* have found foreign key and the lookup table so pick a random primary key from the lookup table */ Set @Sql = 'Select top 1 @InsertValue = Cast(' + @FK_ColumnName + ' as varchar(200)) From ' + @FK_TableName + ' Order By newid()' Execute sp_executesql @Sql, N'@InsertValue nvarchar(128) OUTPUT', @InsertValue=@InsertValue OUTPUT End Else if(Len(@ConstraintName) > 0) Begin /* OK we've found the foreign key constraint but have no idea what the lookup table is. This is because we are joining on a unique index to the lookup table not a primary key. Make a MASSIVE assumption in this instance - that the lookup table has a link value of 1 */ Set @InsertValue = '1' End Else Begin -- no foreign key so the max that can be inserted is based on the datatype -- do not bother with any thing greater than thirty thousand - big enough if @DataType = 'bit' Set @MaxValue = 1 else if @DataType = 'tinyint' Set @MaxValue = 255 else if @DataType = 'smallint' Set @MaxValue = 32767 else Set @MaxValue = 32767 -- randomly generate a number to insert up to maximum Set @InsertValue = Cast(ROUND(((@MaxValue - 1) * RAND() + 1), 0) as varchar(200)) End End -- end of numeric processing -- building up output string Declare @Delimiter char(1) if @FirstLoop = 1 Begin Set @FirstLoop = 0 Set @Delimiter = '' End Else Begin Set @Delimiter = ',' End Set @SqlOutputFields = @SqlOutputFields + @Delimiter + @ColumnName if @StringInsert = 1 Begin Set @SqlOutputValues = @SqlOutputValues + @Delimiter + '''' + @InsertValue + '''' End Else Begin Set @SqlOutputValues = @SqlOutputValues + @Delimiter + @InsertValue End Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength End -- finished this column = go to next close procCursor deallocate procCursor -- outputting the sql string Set @SqlOutputFields = @SqlOutputFields + ')' Set @SqlOutputValues = @SqlOutputValues + ')' select @SqlOutputFields + ' ' + @SqlOutputValues End Go
usp_RandomInsertGeneratorWrapper
It’s very easy to create a wrapper stored procedure to get multiple statements. One row of random data isn’t that much use – 1000 might be.
Create Procedure usp_RandomInsertGeneratorWrapper ( @TableName nvarchar(128), @NumberOfInserts int ) As Begin Set NoCount On Declare @StatementCount int Set @StatementCount = 0 While @StatementCount < @NumberOfInserts Begin exec usp_RandomInsertGenerator @TableName Set @StatementCount = @StatementCount + 1 End End
Usage
To get an insert for one line of data
exec usp_RandomInsertGenerator ‘MyTableName’
To get multiple insert statements call the wrapper function
usp_RandomInsertGeneratorWrapper ‘MyTableName’, 1000
Notes and disclaimers
What it does
- It outputs an insert T-SQL statement for any named table in the database.
- Detects foreign keys and inserts and makes a random but valid link to the child table (this was the hard part)
- Detect different data types and does the necessary SQL formatting
- Accounts for max character length in string type fields
- Accounts for identity columns
- Detects nullable columns and leaves them null
What it doesn’t do
- Doesn’t account for unique indexes
- There are omitted datatypes (cursor, timestamp, binary, varbinary, image) that I’m not really interested in right now.
Hacks and shortcuts
Where the foreign key links are linked to a child table via a unique index rather than a primary key it assumes a foreign key of 1. The INFORMATION_SCHEMA views don’t include index information. So this would necessitate delving into the sys.index table. Not impossible but the proc was complicated enough as it was. I didn’t need it so I didn’t do it.
Assumptions
- Foreign keys are of a numeric type i.e. int, float et. al.
- Foreign key constraints exist on the database. It isn’t magic – it can’t work out the relationships if they don’t exist.
Room for improvement
Here’s some extensions that may be useful for individual projects
- Implement missing datatypes.
- Create custom inserts for known field names to give more realistic data for your application
- Randomise the character data properly.
So that’s it. Live it, love it and fiddle about with it a bit.
Useful links
The official documentation for the INFORMATION_SCHEMA used in this procedure is here
https://msdn.microsoft.com/en-us/library/ms186778.aspx
As ever the source code is on GitHub. There is nothing more than what is here but my WordPress code formatting plugin hasn’t made a great job of formatting the SQL code so the GitHub link might give clearer view of the SQL. Also, it seemed cruel not to give it a home on my demo repository with the rest of my code samples.
https://github.com/timbrownls20/Demo/blob/master/T-SQL/usp_RandomInsertGenerator.sql