-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Description
In testing #3387 , I discovered a segfault related to delayed auth (which exists in the develop branch today). In handle__connect, this block crashes:
/* Check for an existing delayed auth check, reject if present */
HASH_FIND(hh_id, db.contexts_by_id_delayed_auth, context->id, strlen(context->id), found_context);
if(found_context){
rc = MOSQ_ERR_UNKNOWN;
goto handle_connect_error;
}The issue occurs when a client connects, delayed auth begins, and then the client disconnects & reconnects.
The root cause appears to be because on disconnect, the context is not removed from contexts_by_id_delayed_auth, but the context's hash metadata is wiped, causing the list to be corrupted.
Updating context__remove_from_by_id to always try removing the delayed auth regardless of in_by_id like so solves one manifestation of the segfault:
void context__remove_from_by_id(struct mosquitto *context)
{
struct mosquitto *context_found;
if(context->id){
HASH_FIND(hh_id_delayed_auth, db.contexts_by_id_delayed_auth, context->id, strlen(context->id), context_found);
if(context_found){
HASH_DELETE(hh_id_delayed_auth, db.contexts_by_id_delayed_auth, context_found);
}
if (context->in_by_id){
HASH_FIND(hh_id, db.contexts_by_id, context->id, strlen(context->id), context_found);
if(context_found){
HASH_DELETE(hh_id, db.contexts_by_id, context_found);
}
context->id_hashv = 0;
context->in_by_id = false;
}
}
}
But there are some cases that are less obvious to me on what to do, namely mosquitto_set_clientid where we only conditionally call context__remove_from_by_id. Perhaps here it's sufficient to simply remove+add the client from contexts_by_id_delayed_auth like what is done for contexts_by_id? Probably need @ralight 's input.