Sunday, 21 May 2017

nodejs mysql memory leak on large volume of high frequency queries

TL;DR

https://gist.github.com/anonymous/1e0faaa99b8bb4afe3749ff94e52c84f - Demonstrates memory consumption implying a leak. Is this in my code or in the mysql package?

Full version :

I am seeing lot of memory leak (and eventual crash every few hours) in my program. This program receives a lot of data over an UDP socket, stores the relevant bits in an in-memory Hash and writes the data in his Hash to a Mysql DB (AWS RDS instance) once every 10 seconds. node.js version is 6.9.4, running on RHEL 7.1

I tried to do some profiling using the "--inspect" option and Chrome devtools and my initial suspicion is the mysql package. To that end I made a simple node.js program which just makes a lot of queries to a local DB and observed that it indeed consumes a lot of memory very fast. Here is the program : https://gist.github.com/anonymous/1e0faaa99b8bb4afe3749ff94e52c84f

I tried a few variations of the program and all of them consumed memory at a rate which was clearly pointing towards an Out of Memory crash. The variations are :

  1. Use a single connection
  2. Use a pool with 30 connections (This is the production setup)
  3. Use a valid query statement
  4. Use an invalid query statement that results in a parse error (The space before the string 123 on line 27 makes it a bad query. Removing the space makes is a valid query)

This above program has nothing like the in-memory DB. It does one thing only : Issue a lot of UPDATE queries to the Mysql DB at a high frequency.

I have set the frequency to 2 seconds to demonstrate the memory consumption easily. Decreasing this frequency will slow down the memory consumption but it will nevertheless be growing. It only delays the crash, but the crash itself is inevitable.

The real usage requirement for frequency is 10 seconds and number of update queries during each run will go up to 10,000. So the numbers in the sample program are pretty close to the real world scenario and are not just some hypothetical simulation numbers.

Here is the output of "--trace-gc" which shows that the memory consumption rises to 400MB within a minute's time :

[29766:0x36c5120]    52326 ms: Scavenge 324.9 (365.1) -> 314.7 (369.1) MB, 8.3 / 0.0 ms [allocation failure].
[29766:0x36c5120]    53292 ms: Scavenge 330.3 (370.1) -> 317.4 (372.1) MB, 3.3 / 0.0 ms [allocation failure].
[29766:0x36c5120]    53477 ms: Scavenge 333.4 (374.1) -> 329.0 (375.1) MB, 15.6 / 0.0 ms [allocation failure].
[29766:0x36c5120]    53765 ms: Scavenge 335.5 (375.1) -> 331.9 (385.1) MB, 20.8 / 0.0 ms [allocation failure].
[29766:0x36c5120]    54701 ms: Scavenge 346.4 (386.1) -> 334.4 (388.1) MB, 5.3 / 0.0 ms [allocation failure].
[29766:0x36c5120]    55519 ms: Scavenge 349.9 (389.1) -> 338.9 (390.1) MB, 5.7 / 0.0 ms [allocation failure].
[29766:0x36c5120]    55614 ms: Scavenge 353.1 (392.1) -> 350.0 (395.1) MB, 17.8 / 0.0 ms [allocation failure].
[29766:0x36c5120]    56081 ms: Scavenge 356.8 (395.1) -> 351.3 (405.1) MB, 18.5 / 0.0 ms [allocation failure].
[29766:0x36c5120]    57195 ms: Scavenge 367.5 (406.1) -> 354.2 (407.1) MB, 3.2 / 0.0 ms (+ 20.1 ms in 236 steps since last GC) [allocation failure].
[29766:0x36c5120]    57704 ms: Scavenge 369.9 (408.1) -> 362.8 (410.1) MB, 12.5 / 0.0 ms (+ 15.7 ms in 226 steps since last GC) [allocation failure].
[29766:0x36c5120]    57940 ms: Scavenge 372.6 (412.1) -> 369.2 (419.1) MB, 21.6 / 0.0 ms (+ 8.5 ms in 139 steps since last GC) [allocation failure].
[29766:0x36c5120]    58779 ms: Scavenge 380.4 (419.1) -> 371.1 (424.1) MB, 11.4 / 0.0 ms (+ 11.3 ms in 165 steps since last GC) [allocation failure].
[29766:0x36c5120]    59795 ms: Scavenge 387.0 (426.1) -> 375.3 (427.1) MB, 10.6 / 0.0 ms (+ 14.4 ms in 232 steps since last GC) [allocation failure].
[29766:0x36c5120]    59963 ms: Scavenge 392.0 (431.3) -> 388.8 (434.3) MB, 19.1 / 0.0 ms (+ 50.5 ms in 207 steps since last GC) [allocation failure].
[29766:0x36c5120]    60471 ms: Scavenge 395.4 (434.3) -> 390.3 (444.3) MB, 20.2 / 0.0 ms (+ 19.2 ms in 96 steps since last GC) [allocation failure].
[29766:0x36c5120]    61781 ms: Scavenge 406.2 (446.3) -> 393.0 (447.3) MB, 3.7 / 0.0 ms (+ 107.6 ms in 236 steps since last GC) [allocation failure].
[29766:0x36c5120]    62107 ms: Scavenge 409.0 (449.3) -> 404.1 (450.3) MB, 16.0 / 0.0 ms (+ 41.0 ms in 227 steps since last GC) [allocation failure].
[29766:0x36c5120]    62446 ms: Scavenge 411.3 (451.3) -> 407.7 (460.3) MB, 22.6 / 0.0 ms (+ 20.2 ms in 103 steps since last GC) [allocation failure].

Questions :

  1. Is this kind of memory consumption expected for so many queries or is this indicative of a leak?
  2. Is my code leaking the memory? Anything obvious that I have missed? Am I using the package in a wrong way?
  3. if this is indeed a leak in the package, are there any immediate workarounds until the leak is fixed?

I am more than happy to provide any other information that is needed to get to the root of this. Please let me know.



via brahmana

No comments:

Post a Comment