Cheraq.com

I innovate, then I am!

LINQ and Inconsistency in pagination result

| 0 comments

LINQ is one of the significant improvements of .Net and from the time it has been added till today I am always using it in all my projects (I forgot how I was using DB before!). But recently I have faced a weird issue. This is not a LINQ issue, but you need to keep it into consideration. Let’s say we have a table with below structure and data inside:

CREATE TABLE [dbo].[Test](
[Id] [INT] IDENTITY(1,1) NOT NULL,
[NDate] [datetime] NOT NULL,
CONSTRAINT [PK_Test] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
SET IDENTITY_INSERT [dbo].[Test] ON
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (1, '2011-08-24 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (2, '2011-08-24 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (3, '2011-08-24 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (4, '2011-08-24 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (5, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (6, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (7, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (8, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (9, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (10, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (11, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (12, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (13, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (14, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (15, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (16, '2011-08-21 00:00:00.000')
SET IDENTITY_INSERT [dbo].[Test] OFF

Now if you want to run below LINQ operation for pagination, you will be surprised:

using(var db = new TestDataContext())
{
	var page_1_items = db.Test.OrderByDescending(i=> i.NDate).Take(10).ToList();
	var page_2_items = db.Test.OrderByDescending(i=> i.NDate).Skip(10).Take(10).ToList();
}

The page_1_items will be as following:

Id  NDate
4   2011-08-24 00:00:00.000
3   2011-08-24 00:00:00.000
2   2011-08-24 00:00:00.000
1   2011-08-24 00:00:00.000
11  2011-08-21 00:00:00.000
10  2011-08-21 00:00:00.000
9   2011-08-21 00:00:00.000
8   2011-08-21 00:00:00.000
7   2011-08-21 00:00:00.000
6   2011-08-21 00:00:00.000

And page_2_items will be:

Id  NDate
11  2011-08-21 00:00:00.000
12  2011-08-21 00:00:00.000
13  2011-08-21 00:00:00.000
14  2011-08-21 00:00:00.000
15  2011-08-21 00:00:00.000
16  2011-08-21 00:00:00.000

Did you noticed that some items are missing (item with ID 5)? and also item with ID equals to 11 will be repeated two times! Well, this was surprising me as well. The reason behind this is, generated query to load first page uses TOP X, but for other pages uses ROW_NUMBER(). This is for improving the performance, but this can make trouble for you as well. Below you can find generated queries:

/* For first page */
SELECT TOP 10 [Id],[NDate] FROM [Test] ORDER BY NDate DESC
 
/* For second page */
SELECT [Id[,[NDate] FROM (
  SELECT ROW_NUMBER() OVER (ORDER BY NDate DESC) AS RNumber, Id,NDate
  FROM [Test]) AS t
WHERE RNumber BETWEEN 11 AND 20
 
/* Supposed to return same result as TOP 10, but it is different!! */
SELECT [Id[,[NDate] FROM (
  SELECT ROW_NUMBER() OVER (ORDER BY NDate DESC) AS RNumber, Id,NDate
  FROM [Test]) AS t
WHERE RNumber BETWEEN 1 AND 10

To solve the issue, you need to add secondary ordering statement to ORDER BY statement (in this case we need to add Id as well). So whenever you are sorting your items and performing pagination, consider secondary ordering as well. This is not a perfect solution, because there are lots of cases where you want to allow user to choose ordering column (like when listing items in grid view, etc.). If you are wondering why LINQ is behaving like this, there might be some performance considerations in doing like this, since using TOP X does not require any other operation and it is more likely to be used (usually you want to show most viewed or commented or recent items, which can be addressed by TOP X easily).

Leave a Reply

Required fields are marked *.

*