Saturday, January 9, 2021

What is MongoDB FTDC (Full Time Diagnostic Data Capture)? How does it impact your platform stability?

In simple words, MongoDB captures server behavior and stores it in a separate directory (diagnostic.data under the data directory). This is mainly for MongoDB Inc. engineers, who use it to analyze mongod and mongos behavior when troubleshooting. FTDC data files are compressed and not human-readable. MongoDB Inc. engineers cannot access FTDC data without explicit permission and assistance from system owners or operators.


By default, FTDC captures data every 1 second.


Sometimes this process leads to high resource consumption. In a few places, it has been reported as overhead for MongoDB.

For example:

https://bugzilla.redhat.com/show_bug.cgi?id=1834377


How this process relates to ticket drops:

This process/thread collects:

  • serverStatus
  • replSetGetStatus (mongod only)
  • collStats for the local.oplog.rs collection (mongod only)
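
To get a feel for how much data is gathered each second, the same commands can be run by hand. The sketch below only approximates what FTDC samples; the helper name ftdcLikeCommands is hypothetical and not part of MongoDB.

```javascript
// Hypothetical helper: returns command documents roughly matching
// the list above. It does not reproduce FTDC's internal collector.
function ftdcLikeCommands(isMongod) {
  // serverStatus is collected on both mongod and mongos.
  const commands = [{ db: "admin", cmd: { serverStatus: 1 } }];
  if (isMongod) {
    // These two are collected on mongod only.
    commands.push({ db: "admin", cmd: { replSetGetStatus: 1 } });
    commands.push({ db: "local", cmd: { collStats: "oplog.rs" } });
  }
  return commands;
}

// Print the command documents as JSON, one per line.
ftdcLikeCommands(true).forEach((c) => console.log(c.db, JSON.stringify(c.cmd)));
```

In the mongo shell, each entry could be run with db.getSiblingDB(c.db).runCommand(c.cmd) to inspect the size of the output.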

When our servers are under load, capturing all of the above data every second can lead to tickets being held longer than usual.
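
One way to watch ticket pressure is to summarize the ticket counters in the serverStatus output. This is a minimal sketch, assuming the wiredTiger.concurrentTransactions layout (read/write, each with out, available and totalTickets); newer releases may report these counters elsewhere. The function name ticketUsage and the sample values are hypothetical.

```javascript
// Sketch: summarize WiredTiger ticket usage from a serverStatus document.
function ticketUsage(status) {
  const tx = status.wiredTiger.concurrentTransactions;
  const summarize = (t) => ({
    inUse: t.out,
    available: t.available,
    // Share of tickets currently held, rounded to a whole percent.
    percentUsed: Math.round((t.out / t.totalTickets) * 100),
  });
  return { read: summarize(tx.read), write: summarize(tx.write) };
}

// Hypothetical sample values, for illustration only.
const sampleStatus = {
  wiredTiger: {
    concurrentTransactions: {
      read: { out: 32, available: 96, totalTickets: 128 },
      write: { out: 120, available: 8, totalTickets: 128 },
    },
  },
};

console.log(JSON.stringify(ticketUsage(sampleStatus)));
```

In the mongo shell, ticketUsage(db.serverStatus()) would give the same summary for a live instance; a low available count under load is the ticket drop described here.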


The only drawback of disabling this feature is that troubleshooting by MongoDB Inc. may take longer.

By default, this feature is enabled.

How to disable it:
  • Runtime (applies immediately, but only until the next instance restart)
  • Config level (requires an instance restart)

Runtime 
use admin
db.adminCommand( { setParameter: 1 , diagnosticDataCollectionEnabled : false } )
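
After changing the parameter, it is worth confirming that the change took effect. The sketch below builds the setParameter document and a matching getParameter document; the two helper names are hypothetical, while setParameter, getParameter and diagnosticDataCollectionEnabled are the server's own names.

```javascript
// Hypothetical helper: builds the command that turns FTDC on or off.
// The command name (setParameter) must be the first key in the document.
function setFtdcEnabledCommand(enabled) {
  return { setParameter: 1, diagnosticDataCollectionEnabled: enabled };
}

// Hypothetical helper: builds the command that reads the current value.
function getFtdcEnabledCommand() {
  return { getParameter: 1, diagnosticDataCollectionEnabled: 1 };
}

// Mongo shell usage (needs a live instance):
//   db.adminCommand(setFtdcEnabledCommand(false))
//   db.adminCommand(getFtdcEnabledCommand())
console.log(JSON.stringify(setFtdcEnabledCommand(false)));
console.log(JSON.stringify(getFtdcEnabledCommand()));
```

The getParameter reply should include a diagnosticDataCollectionEnabled field showing the current setting.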


MongoDB config file

setParameter:
  diagnosticDataCollectionEnabled: false

After disabling it, we saw a huge change in our platform.