Add database index for significant improvement in user management


One particular query that gets executed by RosterItemProvider#getUsernames(String jid) takes a long time to execute. The getUserNames function is part of the 'remove user registration' routine'. It is used to clean up the roster of those users that have the user-to-be-deleted on their roster.

I'm using a copy of our data, running on my local machine, to do some benchmarking. Note that I'm using Postgres to illustrate this issue, but the same principle goes for all other DBMSses.

I was using this query in my tests:

Without any changes, this is the result from EXPLAIN (which gives you an indication of the costs of execution):

"Unique (cost=146804.89..146805.71 rows=1 width=12)"
" -> Sort (cost=146804.89..146805.30 rows=165 width=12)"
" Sort Key: username"
" -> Seq Scan on jiveroster (cost=0.00..146798.81 rows=165 width=12)"
" Filter: ((jid)::text = 'guus'::text)"

The query takes 6905ms to execute.

Next, I added an index to the "jid" column of the "jiveRoster" table. With the index, the result changes to this:

"Unique (cost=624.90..625.69 rows=1 width=12)"
" -> Sort (cost=624.90..625.30 rows=159 width=12)"
" Sort Key: username"
" -> Bitmap Heap Scan on jiveroster (cost=3.56..619.08 rows=159 width=12)"
" Recheck Cond: ((jid)::text = 'guus'::text)"
" -> Bitmap Index Scan on jiveroster_jid_idx (cost=0.00..3.56 rows=159 width=0)"
" Index Cond: ((jid)::text = 'guus'::text)"

Now, the query takes 10ms to execute.

Note the difference in time: 7 seconds versus 10 milliseconds!




Daniel Henninger


Guus der Kinderen