Saturday, February 25, 2012

Moving ntext to nvarchar(max)

I just move our SQL server to version 2005. In new version ntext field is deprecated and documentation says that ntext(max) should be used.

If I have table Table1 and ntext column Column1. When I execute following SQL statements:

alter table Table1 alter column

Column1 nvarchar(max)

go

1.) Are out of row data automatically move to in row?

2.) Or should I also execute something like this ?

update Table1 set Column1 = Column1+'' where Column1 is not null

3.) Is there way to check if data is stored out or in row?

Best regards

edvin

Hi,

see this article for more information:

http://msdn2.microsoft.com/en-US/library/ms189087.aspx

Find out the option on your table using sp_tableoption

HTH, Jens Suessmeyer.

http://www.sqlserver2005.de

|||

>> 1.) Are out of row data automatically move to in row?

No. ALTER TABLE operation is typically a metadata only operation. There are exceptions however where we have to rewrite every single row. But in this case, there is no reason to touch every single row and text page.

>> 2.) Or should I also execute something like this ?

>> update Table1 set Column1 = Column1+'' where Column1 is not null

You could, but I would recommend against doing this due to the logging requirements but it depends on the number of rows being updated and the percentage of data in the table that can be moved in-row. Future inserts will automatically be in-row depending on their length. Updates to values that are already stored out-of-row will be moved in-row depending on their length. You could recreate the table using SELECT INTO operation which will be more efficient than UPDATE but requires more logic. And you may also have to defrag the table if you update due to migration of the rows or fragmentation of pages.

>> 3.) Is there way to check if data is stored out or in row?

There is no easy way to check this. But you can use the query below to see the allocations happening in the table based on the data insertions. Below is some sample code that will demonstrate the whole process.

use tempdb

go

create table dbo.lobtest ( i int not null identity, t ntext not null );

insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024*100));

insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));

-- look at allocation entries (generic query to alloc units for a table):
select au.*
, coalesce(p1.object_id, p2.object_id) as object_id
, coalesce(p1.index_id, p2.index_id) as index_id
from sys.allocation_units as au
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.hobt_id = au.container_id and au.type in (1, 3)
) as p1
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.partition_id = au.container_id and au.type = 2
) as p2
where coalesce(p1.object_id, p2.object_id) = object_id('dbo.lobtest');

-- convert ntext column:
alter table dbo.lobtest alter column t nvarchar(max) not null;

-- look at allocation entries again:
select au.*
, coalesce(p1.object_id, p2.object_id) as object_id
, coalesce(p1.index_id, p2.index_id) as index_id
from sys.allocation_units as au
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.hobt_id = au.container_id and au.type in (1, 3)
) as p1
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.partition_id = au.container_id and au.type = 2
) as p2
where coalesce(p1.object_id, p2.object_id) = object_id('dbo.lobtest');

-- add more rows that should be inline
insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));

insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));

insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));

-- check allocation entries again (only IN_ROW_DATA pages increased):
select au.*
, coalesce(p1.object_id, p2.object_id) as object_id
, coalesce(p1.index_id, p2.index_id) as index_id
from sys.allocation_units as au
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.hobt_id = au.container_id and au.type in (1, 3)
) as p1
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.partition_id = au.container_id and au.type = 2
) as p2
where coalesce(p1.object_id, p2.object_id) = object_id('dbo.lobtest');

drop table dbo.lobtest;
go

|||

Hi,

thanks for your replies. But are you really sure that simple updating all rows will not move date from out row to in row. Because in SQL documentation (http://msdn2.microsoft.com/en-US/library/ms189087.aspx) states that:


When the large value types out of row option value is changed, existing varchar(max), nvarchar(max), varbinary(max), and xml values are not immediately converted. The storage of the strings is changed as they are subsequently updated. Any new values inserted into a table are stored according to the table option in effect.
-

From this I understand that updating row data will also change the storage. Recreating tables demands also recreating constraints, indexes..etc..

Best regards
edvin

|||The default for large value types is to store in-row. And when you convert from ntext to nvarchar(max) the default is to store in-row. I just tested and it seems like if you update the data it moves from out-of-row to in-row if the value fits in-row. If you want to see immediate benefits you will have to recreate the table or update the rows (this is more expensive due to logging requirements than using say SELECT...INTO but it depends on the number of rows you want to update in the table). Additionally, you may have to defragment the table if the rows are updated due to data movement.

No comments:

Post a Comment