@thorsteneckel

Hi there lovely maintainers,

first of all: Thanks for this great gem! It does a great job over at Zammad. However, we faced an issue in one of our customers' installations. Long story short: The DB connection socket is closed (due to a restarting DB) while Delayed::Job reserves the job. Delayed::Job rescues the raised exception, logs it as INFO(!), calls the recover_from method on the backend, and then either tries to re-run the job or skips it. An example log entry looks like this:

I, [2018-11-12T16:56:33.875406 #4559] INFO -- : 2018-11-12T16:56:33+0100: [Worker(host:some_name pid:1337)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-11-12 15:56:33.873572', locked_by = 'host:some_name pid:1337' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-11-12 15:56:33.872595' AND (locked_at IS NULL OR locked_at < '2018-11-12 11:56:33.872665') OR locked_by = 'host:some_name pid:1337') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *

This gem does not utilize the Delayed::Job recover_from callback yet. This PR changes that to make sure the DB connection is still present after any exception was raised while processing a job. If the DB connection is lost and can't be reestablished, a new exception will be raised and will have to be handled accordingly.
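To illustrate the idea, here is a minimal, self-contained sketch rather than the actual diff. `FakeConnection`, `SketchBackend` and `reserve_job` are hypothetical stand-ins for the real connection adapter, the job backend and the worker's reserve step. The assumption is that the real backend would call something like ActiveRecord's `verify!` on its connection inside `recover_from`.

```ruby
# Hypothetical stand-in for a database connection (not a real ActiveRecord adapter).
class FakeConnection
  attr_reader :reconnects

  def initialize
    @alive = true
    @reconnects = 0
  end

  # Simulates the DB restarting and closing the socket.
  def drop!
    @alive = false
  end

  def active?
    @alive
  end

  # Same idea as a "verify" call: reconnect only if the connection is dead.
  def verify!
    return if active?

    @reconnects += 1
    @alive = true
  end

  # Reserving a job fails while the socket is closed.
  def reserve
    raise IOError, "socket closed" unless active?

    :job
  end
end

# Hypothetical backend that implements the recover_from callback.
class SketchBackend
  class << self
    attr_accessor :connection

    def reserve
      connection.reserve
    end

    # Called by the worker after a failed reservation.
    def recover_from(_error)
      connection.verify!
    end
  end
end

# Simplified worker step: rescue the error, let the backend recover, skip this round.
def reserve_job(backend)
  backend.reserve
rescue StandardError => e
  backend.recover_from(e)
  nil
end

conn = FakeConnection.new
SketchBackend.connection = conn
conn.drop!

first_attempt = reserve_job(SketchBackend)  # fails, triggers recovery
second_attempt = reserve_job(SketchBackend) # connection is back
```

In this sketch the first attempt returns `nil` and reconnects once, and the second attempt gets a job again. If `verify!` itself raised because the DB is still down, that exception would propagate out of `recover_from`, which matches the behaviour described above.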

Sadly I have no clue how to provide tests for this. I tried my best but haven't found out how. Please let me know how you would approach it and I'm happy to add those.

Greetings from Germany 👋

@thorsteneckel
Author

The failing TravisCI jobs have a different cause than the changes I introduced. Let me know if/how I can help to get this merged. Thanks!

sauy7 added a commit to fishbrain/delayed_job_active_record that referenced this pull request Sep 11, 2021
@jlahtinen

I have tested this and it works for closed connections with oracledb and jruby 9.4.12.0

I think #227 should be merged in as well. This gem should also start to use lease_connection: https://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html
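A rough sketch of what a backwards-compatible switch could look like, assuming `lease_connection` is only available on newer Rails versions and `connection` is the fallback on older ones. `checked_out_connection`, `NewStyleModel` and `OldStyleModel` are hypothetical names used only for illustration. In the gem, the argument would presumably be `ActiveRecord::Base`.

```ruby
# Prefer lease_connection when the class offers it, otherwise fall back
# to the older connection method.
def checked_out_connection(model)
  if model.respond_to?(:lease_connection)
    model.lease_connection
  else
    model.connection
  end
end

# Hypothetical stand-in for a model class on a newer Rails version.
class NewStyleModel
  def self.lease_connection
    :leased
  end

  def self.connection
    :legacy
  end
end

# Hypothetical stand-in for a model class on an older Rails version.
class OldStyleModel
  def self.connection
    :legacy
  end
end

new_result = checked_out_connection(NewStyleModel)
old_result = checked_out_connection(OldStyleModel)
```

The `respond_to?` check keeps a single code path working across Rails versions without pinning the gem to a minimum version.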

@jlahtinen

@thorsteneckel if you would like to add tests for this, you could check https://github.com/Shopify/toxiproxy

That could be one way to temporarily drop packets between the DB and ActiveRecord while testing. I did not look at how the tests are implemented; I assume they depend on a database at localhost on port X.
