
Hi all,

Hopefully some light can be shed on the problem we're experiencing - OTRS is rapidly becoming unusable in our case (and we *really* don't want to abandon it). This is either a database/SQL issue, or some other issue related to OTRS.

Scenario: an agent has approx. 315 locked tickets, with 292-odd tickets in the 'new messages' queue. Clicking on the 'locked tickets' queue (to display the tickets) results in a delay of 30-60 minutes. Opening a normal queue with 17 tickets results in a delay of about 45 seconds. So, quite a problem.

The following is our setup: Apache 1.3.31, mod_perl, mod_php, mod_* etc. The documented mod_perl/OTRS setup was followed. Using PostgreSQL 7.4.2.

With the following in the PostgreSQL config:

  log_statement = true
  log_duration = true

I see zillions of the following:

  SELECT sdt.name, sa.create_by, st.until_time
  FROM article_sender_type sdt, article sa, ticket st
  WHERE st.id = 3604
    AND sa.ticket_id = st.id
    AND sdt.id = sa.article_sender_type_id
  ORDER BY sa.create_time
  LOG: duration: 533.672 ms - 800 ms

  SELECT sa.ticket_id, sa.a_from, sa.a_to, sa.a_cc, sa.a_subject,
    sa.a_reply_to, sa.a_message_id, sa.a_body, st.create_time_unix,
    st.ticket_state_id, st.queue_id, sa.create_time, sa.a_content_type,
    sa.create_by, st.tn, article_sender_type_id, st.customer_id,
    st.until_time, st.ticket_priority_id, st.customer_user_id, st.user_id,
    su.login, sa.article_type_id, sa.a_freekey1, sa.a_freetext1,
    sa.a_freekey2, sa.a_freetext2, sa.a_freekey3, sa.a_freetext3,
    st.ticket_answered, sa.incoming_time, sa.id, st.freekey1, st.freetext1,
    st.freekey2, st.freetext2, st.freekey3, st.freetext3, st.freekey4,
    st.freetext4, st.freekey5, st.freetext5, st.freekey6, st.freetext6,
    st.freekey7, st.freetext7, st.freekey8, st.freetext8
  FROM article sa, ticket st, system_user su
  WHERE sa.id = 44681 AND sa.ticket_id = st.id AND st.user_id = su.id
  ORDER BY sa.id ASC
  LOG: duration: 809.421 ms - 1000 ms

Is OTRS performing single and separate SELECTs for every blimming ticket in the set in question?
Tweaking PG parameters really has no impact on this. The server in question is reasonably busy, with lots of memory, a fast processor, etc. Load average varies between 0.3 and 1.5. The same server handles Bugzilla with stellar performance.

What on earth am I missing here? I refuse to give up on OTRS just yet. Any help would be appreciated.

Henry

--------------------------------------------------------
This message was sent using MetroWEB's AirMail service.
http://www.metroweb.co.za/ - full access for only R73.
Free Web Accelerator, WebMail, Calendar, Anti-Virus, Anti-Spam,
10 emails, 100MB personal webspace, and more! Phone Now! 086 11 11 440

* Henry Combrinck
apache 1.3.31, mod_perl, mod_php, mod_* etc. Documented mod_perl/otrs setup was followed. Using Postgres 7.4.2.
With Postgresql config: log_statement = true log_duration = true
I see zillions of the following:
SELECT sdt.name, sa.create_by, st.until_time FROM article_sender_type sdt, article sa, ticket st WHERE st.id = 3604 AND sa.ticket_id = st.id AND sdt.id = sa.article_sender_type_id ORDER BY sa.create_time LOG: duration: 533.672 ms - 800ms
This might not solve all your issues, but I'd suggest changing the types of the serial number columns so that your indexes actually come into play. You might want to ask Postgres to explain its plan for the query above to confirm that this is a problem for you ("EXPLAIN SELECT sdt.name..."), but I'll bet it is.

See also the thread http://thread.gmane.org/gmane.comp.otrs.devel/530, where Robert Mathews outlines a way to change the type of these columns in-place. (He does this to harmonize the columns to bigint; I'd suggest going from bigint to conventional int.)

Arne.
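For readers following the thread, the check Arne suggests can be sketched like this, using the first query Henry logged (table and column names are the ones from the log above; the exact plan output will of course vary):

```sql
-- Ask the planner how it intends to run the problematic query.
EXPLAIN
SELECT sdt.name, sa.create_by, st.until_time
FROM article_sender_type sdt, article sa, ticket st
WHERE st.id = 3604
  AND sa.ticket_id = st.id
  AND sdt.id = sa.article_sender_type_id
ORDER BY sa.create_time;
-- A "Seq Scan" line on article or ticket means the relevant index is
-- being skipped; an "Index Scan" line means it is being used.
```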

This might not solve all your issues, but I'd suggest changing the types of the serial number columns so that your indexes actually come into play. You might want to ask postgres to explain its plan for the query above to confirm that this is a problem for you ("EXPLAIN SELECT sdt.name..."), but I'll bet it is.
See also the thread http://thread.gmane.org/gmane.comp.otrs.devel/530, where Robert Mathews outlines a way to change the type of these columns in-place. (He does this to harmonize the columns to bigint; I'd suggest going from bigint to conventional int.)
Thanks for the info Arne. Here's pg's explain output:

 Sort  (cost=7132.87..7132.87 rows=1 width=3062)
   Sort Key: sa.id
   ->  Nested Loop  (cost=1716.31..7132.86 rows=1 width=3062)
         Join Filter: ("outer".ticket_id = "inner".id)
         ->  Seq Scan on article sa  (cost=0.00..5379.65 rows=2 width=1485)
               Filter: (id = 44681)
         ->  Materialize  (cost=1716.31..1724.51 rows=820 width=1581)
               ->  Hash Join  (cost=1.19..1716.31 rows=820 width=1581)
                     Hash Cond: ("outer".user_id = "inner".id)
                     ->  Seq Scan on ticket st  (cost=0.00..1652.28 rows=10928 width=1513)
                     ->  Hash  (cost=1.15..1.15 rows=15 width=72)
                           ->  Seq Scan on system_user su  (cost=0.00..1.15 rows=15 width=72)

It does appear to be using the index "article_pkey" (sa.id), unless I'm reading the above incorrectly... I'll give the other link (bigint -> int) a try and see, but I doubt this will have a major impact on the problem.

Henry

* Henry Combrinck
It does appear to be using the index "article_pkey" (sa.id) unless I'm reading the above incorrectly...
Well, no, it's doing sequential scans ("Seq Scan") on both article and ticket, and you really want to avoid that. Here's what it would look like (modulo other local configuration issues that might influence the plan layout) if the indexes were used:

 Sort  (cost=10.70..10.72 rows=7 width=26)
   Sort Key: sa.create_time
   ->  Nested Loop  (cost=4.31..10.60 rows=7 width=26)
         ->  Index Scan using ticket_pkey on ticket st  (cost=0.00..3.07 rows=1 width=8)
               Index Cond: (id = 3604)
         ->  Merge Join  (cost=4.31..7.46 rows=7 width=26)
               Merge Cond: ("outer".id = "inner".article_sender_type_id)
               ->  Index Scan using article_sender_type_pkey on article_sender_type sdt  (cost=0.00..3.03 rows=3 width=14)
               ->  Sort  (cost=4.31..4.33 rows=7 width=18)
                     Sort Key: sa.article_sender_type_id
                     ->  Index Scan using article_ticket_id on article sa  (cost=0.00..4.22 rows=7 width=18)
                           Index Cond: (ticket_id = 3604)

As you can see, even Postgres' theoretical "cost" is several orders of magnitude lower. (This is with 7k/28k tickets/articles however; it seems you've got a tad more tickets in your database.)
I'll give the other link (bigint-int) a try and see, but I doubt this will have a major impact on the problem.
I'll be surprised if you don't see a noticeable improvement, but there might of course be other issues as well. Remember to tell postgres to "analyze". (Why this isn't done automatically is beyond me.) Arne.
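The "analyze" step Arne mentions is a one-liner in psql. A minimal sketch, assuming the column conversion has just been applied (table names are the OTRS ones discussed above):

```sql
-- Refresh planner statistics after the type change; with stale
-- statistics the planner may keep preferring sequential scans.
ANALYZE;

-- Or restrict it to the tables that were altered:
ANALYZE ticket;
ANALYZE article;
```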

* Henry Combrinck
It does appear to be using the index "article_pkey" (sa.id) unless I'm reading the above incorrectly...
Well, no, it's doing sequential scans ("Seq Scan") on both article and ticket, and you really want to avoid that. Here's what it would look like (modulo other local configuration issues that might influence the plan layout) if the indexes were used:
Aah, crap, you're right! I had no idea PG 7.4 ignored indexes if the types weren't *exactly* the same. system_user.id is integer... Crikey, will need to monitor this and see what other tables are buggered.

Thanks for the pointers.

Regards
Henry

* Henry Combrinck
aah, crap, you're right! I had no idea pg7.4 ignored indexes if the types weren't *exactly* the same.
Charming, innit?
system_user.id is integer...
Not to mention literal integers, which really bites...
crikey, will need to monitor this and see what other tables are buggered.
I've simply converted all bigint columns to int in my installations. No adverse effects so far. Arne.
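One way to see which tables are affected, sketched here on the assumption that the information_schema views shipped with PostgreSQL 7.4 are available in the installation:

```sql
-- List every bigint column in the public schema, giving a review
-- list of conversion candidates before any ALTER TABLE is run.
SELECT table_name, column_name
FROM information_schema.columns
WHERE table_schema = 'public'
  AND data_type = 'bigint'
ORDER BY table_name, column_name;
```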

* Henry Combrinck
aah, crap, you're right! I had no idea pg7.4 ignored indexes if the types weren't *exactly* the same.
Charming, innit?
system_user.id is integer...
Not to mention literal integers, which really bites...
crikey, will need to monitor this and see what other tables are buggered.
I've simply converted all bigint columns to int in my installations. No adverse effects so far.
You don't have a nice little alter...table(s) script you can paste herein, do you? ;-)

I'm thinking the only way now is to dumpdb, vi:g/bigint/s//int/g, restore... and hope for the best :-(

* Henry Combrinck
You don't have a nice little alter...table(s) script you can paste herein, do you? ;-)
I think Robert Mathews' snippet should work, but I haven't tested it myself. If I generalize his code a bit, to change column "bar" in table "foo" to int4, something like this would be required:

  BEGIN;
  LOCK TABLE foo;
  ALTER TABLE foo ADD COLUMN new_col int;
  UPDATE foo SET new_col = bar;
  ALTER TABLE foo DROP COLUMN bar;
  ALTER TABLE foo RENAME COLUMN new_col TO bar;
  ALTER TABLE foo ADD PRIMARY KEY (bar);
  ALTER TABLE foo ALTER COLUMN bar SET DEFAULT nextval('"foo_bar_seq"'::text);
  COMMIT;
  ANALYZE;

The exact contents of nextval(...) should probably be copied from "\d foo" beforehand. Keep your backups handy, and if you have the luxury to do so, try it on your test systems first...

Arne.

* Henry Combrinck
You don't have a nice little alter...table(s) script you can paste herein, do you? ;-)
I think Robert Mathews' snippet should work, but I haven't tested it myself. If I generalize his code a bit, to change column "bar" in table "foo" to int4, something like this would be required:
BEGIN;
LOCK TABLE foo;
ALTER TABLE foo ADD COLUMN new_col int;
UPDATE foo SET new_col = bar;
ALTER TABLE foo DROP COLUMN bar;
ALTER TABLE foo RENAME COLUMN new_col TO bar;
ALTER TABLE foo ADD PRIMARY KEY (bar);
ALTER TABLE foo ALTER COLUMN bar SET DEFAULT nextval('"foo_bar_seq"'::text);
COMMIT;
ANALYZE;
The exact contents of nextval(...) should probably be copied from "\d foo" beforehand.
Keep your backups handy, and if you have the luxury to do so, try it on your test systems first...
Yup - that would work; too much work though. My 600MB dump has just finished. I'm going to dump the schema separately, replace all bigints with int, and restore.
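Henry's dump-and-edit route can be sketched as below. The sample schema, file names, and database name are stand-ins; a real run would take the schema from pg_dump -s instead of the here-document:

```shell
#!/bin/sh
# Sketch of the dump/edit/restore approach. In real life the schema
# would come from:  pg_dump -s otrs > schema.sql  ("otrs" is a
# stand-in database name). A tiny sample schema is used here so the
# edit step can be shown end to end.

cat > schema.sql <<'EOF'
CREATE TABLE ticket (
    id bigint DEFAULT nextval('ticket_id_seq'),
    user_id integer
);
EOF

# Replace "bigint" only where it appears as a whole word, so
# identifiers that merely contain the string are left untouched.
sed 's/\bbigint\b/int/g' schema.sql > schema-int.sql

cat schema-int.sql
# The edited schema and a separate data-only dump (pg_dump -a) would
# then be restored with psql, followed by an ANALYZE.
```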

Using a cast on the literal number:

  WHERE sa.id = 44681::bigint AND sa.ticket_id = st.id AND st.user_id = su.id

yields a far better result:

 Sort  (cost=16.34..16.34 rows=3 width=2928)
   Sort Key: sa.id
   ->  Hash Join  (cost=1.19..16.31 rows=3 width=2928)
         Hash Cond: ("outer".user_id = "inner".id)
         ->  Nested Loop  (cost=0.00..15.08 rows=3 width=2917)
               ->  Index Scan using article_pkey on article sa  (cost=0.00..3.04 rows=2 width=1624)
                     Index Cond: (id = 44681::bigint)
               ->  Index Scan using ticket_pkey on ticket st  (cost=0.00..6.01 rows=1 width=1301)
                     Index Cond: ("outer".ticket_id = st.id)
         ->  Hash  (cost=1.15..1.15 rows=15 width=15)
               ->  Seq Scan on system_user su  (cost=0.00..1.15 rows=15 width=15)

Dunno why a seq scan is being done on system_user (st.user_id and su.id are both integer).

* Henry Combrinck
using a cast on the literal number:
where sa.id = 44681::bigint AND sa.ticket_id = st.id AND st.user_id = su.id
yields a far better result:
Sort  (cost=16.34..16.34 rows=3 width=2928)
  Sort Key: sa.id
  ->  Hash Join  (cost=1.19..16.31 rows=3 width=2928)
        Hash Cond: ("outer".user_id = "inner".id)
        ->  Nested Loop  (cost=0.00..15.08 rows=3 width=2917)
              ->  Index Scan using article_pkey on article sa  (cost=0.00..3.04 rows=2 width=1624)
                    Index Cond: (id = 44681::bigint)
              ->  Index Scan using ticket_pkey on ticket st  (cost=0.00..6.01 rows=1 width=1301)
                    Index Cond: ("outer".ticket_id = st.id)
        ->  Hash  (cost=1.15..1.15 rows=15 width=15)
              ->  Seq Scan on system_user su  (cost=0.00..1.15 rows=15 width=15)
It seems Postgres 7.4.2 is able to use indexes when comparing two columns of type int/bigint, but not under all circumstances:

testdb=> \d foo
      Table "public.foo"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
Indexes:
    "foo_a" btree (a)

testdb=> \d bar
     Table "public.bar"
 Column |  Type  | Modifiers
--------+--------+-----------
 a      | bigint |
Indexes:
    "bar_a" btree (a)

testdb=> explain select * from foo, bar where foo.a = bar.a;
                                QUERY PLAN
---------------------------------------------------------------------------
 Nested Loop  (cost=0.00..301630.10 rows=9732096 width=12)
   Join Filter: ("inner".a = "outer".a)
   ->  Index Scan using bar_a on bar  (cost=0.00..61.85 rows=3168 width=8)
   ->  Index Scan using foo_a on foo  (cost=0.00..56.79 rows=3072 width=4)
(4 rows)

testdb=> explain select * from foo, bar where foo.a = bar.a and bar.a = 2;
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=100000000.00..100000057.62 rows=1 width=12)
   ->  Index Scan using foo_a on foo  (cost=0.00..2.01 rows=1 width=4)
         Index Cond: (2 = a)
   ->  Seq Scan on bar  (cost=100000000.00..100000055.60 rows=1 width=8)
         Filter: (a = 2)
(5 rows)

testdb=> explain select * from foo, bar where foo.a = bar.a and bar.a = int8(2);
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=100000000.00..100000054.42 rows=1 width=12)
   ->  Seq Scan on foo  (cost=100000000.00..100000052.40 rows=1 width=4)
         Filter: (2::bigint = a)
   ->  Index Scan using bar_a on bar  (cost=0.00..2.01 rows=1 width=8)
         Index Cond: (a = 2::bigint)
(5 rows)

testdb=> explain select * from foo, bar where foo.a = int4(bar.a) and bar.a = int8(2);
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Merge Join  (cost=2.02..74.35 rows=16 width=12)
   Merge Cond: ("outer"."?column2?" = "inner".a)
   ->  Sort  (cost=2.02..2.03 rows=1 width=8)
         Sort Key: int4(bar.a)
         ->  Index Scan using bar_a on bar  (cost=0.00..2.01 rows=1 width=8)
               Index Cond: (a = 2::bigint)
   ->  Index Scan using foo_a on foo  (cost=0.00..56.79 rows=3072 width=4)
(7 rows)

testdb=>

If one of the terms in the int/bigint clause is also compared to a literal somewhere else, this apparently also affects the way the first clause is planned. Not always to good effect, it would seem.

To get back on topic: in my view the root of the problem is the int4 literals, and I don't see any good way of getting around that which doesn't involve a lot of changes all over the OTRS code base. In addition, just casting literals to bigint all over is not going to be a good solution, since much of the code is supposed to be SQL-dialect-neutral. I still think the easy (and correct, but that's perhaps more subjective) way out is to convert bigint to int all over the Postgres schema.

Arne.

Sorry - ignore my comment about system_user (integer is correct). :-)
participants (2)
-
Arne Georg Gleditsch
-
Henry Combrinck