Connection Error Fix #607

Closed
Saurav-D wants to merge 1 commit into lanpa:master from Saurav-D:fix_connection_error

Conversation

@Saurav-D Saurav-D commented Oct 14, 2020

Adding on to the PR: #555.
Context: `.flush()` in `_EventLoggerThread` creates a new connection each time it is called. If the connection fluctuates, S3 or GCS throws an error; because the error is not handled, the thread hangs, and because the queue is then full, training hangs as well. The added try block keeps the thread from getting stuck: it waits for the connection to come back instead. Because the retry sits in a while loop, training won't resume until the connection is established again. The connection variable makes sure the message is printed only once.

Refer to this issue for more details: #606

I'm unsure if this is the right place to catch the error; maybe it can be done individually in the GCS and S3 writers.
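
For readers without the diff at hand, the retry pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `flush_with_retry`, `retry_interval`, `sleep`, and `log` are hypothetical names, and catching a broad `Exception` is an assumption, since the actual errors raised by the S3 and GCS writers are not listed here.

```python
import time


def flush_with_retry(flush, retry_interval=1.0, sleep=time.sleep, log=print):
    """Call flush() until it succeeds, reporting the outage only once.

    Hypothetical helper: in the PR the equivalent logic lives inside the
    logger thread's loop rather than in a standalone function.
    """
    connected = True  # stands in for the "connection variable" in the description
    while True:
        try:
            flush()
        except Exception as exc:  # assumption: real code may catch narrower errors
            if connected:
                # Print on the first failure only, not on every retry.
                log("Connection error during flush, waiting to retry: %r" % (exc,))
                connected = False
            sleep(retry_interval)
        else:
            return  # flush succeeded, the caller can continue


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_flush():
        # Simulated writer that fails twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ConnectionError("simulated outage")

    flush_with_retry(flaky_flush, retry_interval=0.1)
```

As in the description, the loop blocks until a flush succeeds, so whatever is feeding the queue stays paused for the duration of the outage rather than failing outright.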

codecov-io commented Oct 14, 2020

Codecov Report

Merging #607 into master will decrease coverage by 0.17%.
The diff coverage is 56.25%.

@@            Coverage Diff             @@
##           master     #607      +/-   ##
==========================================
- Coverage   80.73%   80.56%   -0.18%     
==========================================
  Files          39       39              
  Lines        2824     2835      +11     
==========================================
+ Hits         2280     2284       +4     
- Misses        544      551       +7     

| Impacted Files | Coverage Δ |
|---|---|
| tensorboardX/event_file_writer.py | 89.74% <56.25%> (-5.54%) ⬇️ |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34d1616...7cab42f. Read the comment docs.

@Saurav-D Saurav-D closed this Oct 14, 2020