Procedimiento para recuperar MariaDB/MySQL con tablas en fallos que no se puede borrar ni reparar y que cuelgan el servidor al consultarse


#1

1. Referencias

  • MYSQL-FORCE-TABLE-DROP-INNODB-FAILURE-001
  • "MySQL server has gone away" después de ejecutar "show tables"
  • InnoDB: but tablespace with that id or name does not exist. Have you deleted or moved .ibd files?
  • Borrado forzado de tablas MySQL con backend InnoDB.

2. Introducción

  1. Requisitos: sólo aplica a bases de datos MySQL/MariaDB soportadas sobre InnoDB.

  2. Si una base de datos queda dañada de tal manera que simplemente conectar a ella produce fallos inmediatos:

    mysql> use dbname_pw
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A
    
    Database changed
    mysql> show tables;
    ERROR 2006 (HY000): MySQL server has gone away
    No connection. Trying to reconnect...
    Connection id:    3
    Current database: dbname_pw
    
    ERROR 2006 (HY000): MySQL server has gone away
    No connection. Trying to reconnect...
    
  3. O también se estan observando fallos en los logs (/var/log/syslog o /var/log/mysql/error.log) como los siguientes:

    Nov  7 18:55:41 node10-server mysqld: InnoDB: Restoring possible half-written data pages from the doublewrite
    Nov  7 18:55:41 node10-server mysqld: InnoDB: buffer...
    Nov  7 18:55:41 node10-server mysqld: 231107 18:55:41  InnoDB: Error: table 'dbname_pw/ps_cart_rule'
    Nov  7 18:55:41 node10-server mysqld: InnoDB: in InnoDB data dictionary has tablespace id 286821,
    Nov  7 18:55:41 node10-server mysqld: InnoDB: but tablespace with that id or name does not exist. Have
    Nov  7 18:55:41 node10-server mysqld: InnoDB: you deleted or moved .ibd files?
    Nov  7 18:55:41 node10-server mysqld: InnoDB: This may also be a table created with CREATE TEMPORARY TABLE
    Nov  7 18:55:41 node10-server mysqld: InnoDB: whose .ibd and .frm files MySQL automatically removed, but the
    Nov  7 18:55:41 node10-server mysqld: InnoDB: table still exists in the InnoDB internal data dictionary.
    Nov  7 18:55:41 node10-server mysqld: InnoDB: Please refer to
    Nov  7 18:55:41 node10-server mysqld: InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
    Nov  7 18:55:41 node10-server mysqld: InnoDB: for how to resolve the issue.
    Nov  7 18:55:41 node10-server mysqld: 231107 18:55:41  InnoDB: Waiting for the background threads to start
    
    
  4. Generándose también backtraceses como el siguiente:

    Nov  7 20:00:58 node10-server mysqld: 19:00:58 UTC - mysqld got signal 11 ;
    Nov  7 20:00:58 node10-server mysqld: This could be because you hit a bug. It is also possible that this binary
    Nov  7 20:00:58 node10-server mysqld: or one of the libraries it was linked against is corrupt, improperly built,
    Nov  7 20:00:58 node10-server mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
    Nov  7 20:00:58 node10-server mysqld: We will try our best to scrape up some info that will hopefully help
    Nov  7 20:00:58 node10-server mysqld: diagnose the problem, but since we have already crashed,
    Nov  7 20:00:58 node10-server mysqld: something is definitely wrong and this may fail.
    Nov  7 20:00:58 node10-server mysqld:
    Nov  7 20:00:58 node10-server mysqld: key_buffer_size=16777216
    Nov  7 20:00:58 node10-server mysqld: read_buffer_size=131072
    Nov  7 20:00:58 node10-server mysqld: max_used_connections=3
    Nov  7 20:00:58 node10-server mysqld: max_threads=400
    Nov  7 20:00:58 node10-server mysqld: thread_count=2
    Nov  7 20:00:58 node10-server mysqld: connection_count=2
    Nov  7 20:00:58 node10-server mysqld: It is possible that mysqld could use up to
    Nov  7 20:00:58 node10-server mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 891396 K  bytes of memory
    Nov  7 20:00:58 node10-server mysqld: Hope that's ok; if not, decrease some variables in the equation.
    Nov  7 20:00:58 node10-server mysqld:
    Nov  7 20:00:58 node10-server mysqld: Thread pointer: 0x7fdf90074270
    Nov  7 20:00:58 node10-server mysqld: Attempting backtrace. You can use the following information to find out
    Nov  7 20:00:58 node10-server mysqld: where mysqld died. If you see no messages after this, something went
    Nov  7 20:00:58 node10-server mysqld: terribly wrong...
    Nov  7 20:00:58 node10-server mysqld: stack_bottom = 7fdf97a14e80 thread_stack 0x30000
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x29)[0x55fe04c62dc9]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x3d8)[0x55fe04b48dc8]
    Nov  7 20:00:58 node10-server mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0xf0a0)[0x7fdfe21b70a0]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(+0x690145)[0x55fe04db1145]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(+0x5fddb3)[0x55fe04d1edb3]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(+0x6291d5)[0x55fe04d4a1d5]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(+0x62971a)[0x55fe04d4a71a]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(+0x5a1bbd)[0x55fe04cc2bbd]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(_ZN7handler7ha_openEP5TABLEPKcii+0x3f)[0x55fe04b4cacf]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(_Z21open_table_from_shareP3THDP11TABLE_SHAREPKcjjjP5TABLEb+0x577)[0x55fe04acc8d7]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(_Z10open_tableP3THDP10TABLE_LISTP11st_mem_rootP18Open_table_context+0x950)[0x55fe04a17ee0]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(_Z11open_tablesP3THDPP10TABLE_LISTPjjP19Prelocking_strategy+0x56b)[0x55fe04a189eb]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(_Z20open_and_lock_tablesP3THDP10TABLE_LISTbjP19Prelocking_strategy+0x49)[0x55fe04a19609]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(+0x3da8a3)[0x55fe04afb8a3]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(_ZN21Check_table_statement7executeEP3THD+0xc1)[0x55fe04afcee1]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x15ff)[0x55fe04a4e0cf]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(+0x3305ee)[0x55fe04a515ee]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1af4)[0x55fe04a53944]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x105)[0x55fe04aef865]
    Nov  7 20:00:58 node10-server mysqld: /usr/sbin/mysqld(handle_one_connection+0x50)[0x55fe04aef980]
    Nov  7 20:00:58 node10-server mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x7fdfe21aeb50]
    Nov  7 20:00:58 node10-server mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fdfe0af2fbd]
    Nov  7 20:00:58 node10-server mysqld:
    Nov  7 20:00:58 node10-server mysqld: Trying to get some variables.
    Nov  7 20:00:58 node10-server mysqld: Some pointers may be invalid and cause the dump to abort.
    Nov  7 20:00:58 node10-server mysqld: Query (7fdf9007f0d0): is an invalid pointer
    Nov  7 20:00:58 node10-server mysqld: Connection ID (thread ID): 529
    Nov  7 20:00:58 node10-server mysqld: Status: NOT_KILLED
    Nov  7 20:00:58 node10-server mysqld:
    Nov  7 20:00:58 node10-server mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
    Nov  7 20:00:58 node10-server mysqld: information that should help you find out what is causing the crash.
    
  5. O también indicaciones de fallo recovery como:

    Nov  7 20:00:30 node10-server mysqld: InnoDB: Progress in percents: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
    Nov  7 20:00:30 node10-server mysqld: InnoDB: Apply batch completed
    Nov  7 20:00:30 node10-server mysqld: 231107 20:00:30  InnoDB: Error: table 'dbname_pw/ps_product_sale'
    Nov  7 20:00:30 node10-server mysqld: InnoDB: in InnoDB data dictionary has tablespace id 287075,
    Nov  7 20:00:30 node10-server mysqld: InnoDB: but tablespace with that id or name does not exist.        Have
    Nov  7 20:00:30 node10-server mysqld: InnoDB: you deleted or moved .ibd files?
    Nov  7 20:00:30 node10-server mysqld: InnoDB: This may also be a table created with CREATE TEMPORARY TABLE
    Nov  7 20:00:30 node10-server mysqld: InnoDB: whose .ibd and .frm files MySQL automatically removed, but the
    Nov  7 20:00:30 node10-server mysqld: InnoDB: table still exists in the InnoDB internal data dictionary.
    Nov  7 20:00:30 node10-server mysqld: InnoDB: Please refer to
    Nov  7 20:00:30 node10-server mysqld: InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
    Nov  7 20:00:30 node10-server mysqld: InnoDB: for how to resolve the issue.
    Nov  7 20:00:30 node10-server mysqld: 231107 20:00:30  InnoDB: Waiting for the background threads to start
    Nov  7 20:00:31 node10-server mysqld: 231107 20:00:31 InnoDB: 5.5.60 started; log sequence number 707763435131
    Nov  7 20:00:31 node10-server mysqld: 231107 20:00:31 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
    Nov  7 20:00:31 node10-server mysqld: 231107 20:00:31 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
    Nov  7 20:00:31 node10-server mysqld: 231107 20:00:31 [Note] Server socket created on IP: '0.0.0.0'.
    Nov  7 20:00:31 node10-server mysqld: 231107 20:00:31 [Note] Event Scheduler: Loaded 0 events
    Nov  7 20:00:31 node10-server mysqld: 231107 20:00:31 [Note] /usr/sbin/mysqld: ready for        connections.
    Nov  7 20:00:31 node10-server mysqld: Version: '5.5.60-0+deb7u1-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Debian)
    Nov  7 20:00:32 node10-server mysqld: 231107 20:00:32  InnoDB: error: space object of table 'dbname_pw/ps_product_sale',
    Nov  7 20:00:32 node10-server mysqld: InnoDB: space id 287075 did not exist in memory. Retrying an open.
    Nov  7 20:00:32 node10-server mysqld: 231107 20:00:32  InnoDB: Operating system error number 2 in a file operation.
    Nov  7 20:00:32 node10-server mysqld: InnoDB: The error means the system cannot find the path specified.
    Nov  7 20:00:32 node10-server mysqld: 231107 20:00:32  InnoDB: Error: trying to open a table, but could not
    Nov  7 20:00:32 node10-server mysqld: InnoDB: open the tablespace file './dbname_pw/ps_product_sale.ibd'!
    Nov  7 20:00:32 node10-server mysqld: InnoDB: Have you moved InnoDB .ibd files around without using the
    Nov  7 20:00:32 node10-server mysqld: InnoDB: commands DISCARD TABLESPACE and IMPORT TABLESPACE?
    Nov  7 20:00:32 node10-server mysqld: InnoDB: It is also possible that this is a temporary table #sql...,
    Nov  7 20:00:32 node10-server mysqld: InnoDB: and MySQL removed the .ibd file for this.
    Nov  7 20:00:32 node10-server mysqld: InnoDB: Please refer to
    Nov  7 20:00:32 node10-server mysqld: InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
    Nov  7 20:00:32 node10-server mysqld: InnoDB: for how to resolve the issue.
    Nov  7 20:00:32 node10-server mysqld: 231107 20:00:32  InnoDB: Error: table 'dbname_pw/ps_product_sale'
    Nov  7 20:00:32 node10-server mysqld: InnoDB: in InnoDB data dictionary has tablespace id 287075,
    Nov  7 20:00:32 node10-server mysqld: InnoDB: but tablespace with that id or name does not exist. Have
    Nov  7 20:00:32 node10-server mysqld: InnoDB: you deleted or moved .ibd files?
    Nov  7 20:00:32 node10-server mysqld: InnoDB: This may also be a table created with CREATE TEMPORARY TABLE
    Nov  7 20:00:32 node10-server mysqld: InnoDB: whose .ibd and .frm files MySQL automatically removed, but the
    Nov  7 20:00:32 node10-server mysqld: InnoDB: table still exists in the InnoDB internal data dictionary.
    Nov  7 20:00:32 node10-server mysqld: InnoDB: Please refer to
    Nov  7 20:00:32 node10-server mysqld: InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
    Nov  7 20:00:32 node10-server mysqld: InnoDB: for how to resolve the issue.
    Nov  7 20:00:32 node10-server mysqld: InnoDB: We removed now the InnoDB internal data dictionary entry
    Nov  7 20:00:32 node10-server mysqld: InnoDB: of table `dbname_pw`.`ps_product_sale`.
    

Entonces revisar este procedimiento.

3. Procedimiento

  1. Confirmar que cuando estos errores se han observado, NO ha sido posible recuperarlos usando las indicaciones de:

  2. Tampoco se corrige usando los distintos niveles de innodb recovery mode:

  3. Para resolverlo, hay que identificar tabla a tabla las que están fallando, conseguir deshalojarlas, y a
    continuación usar una copia de seguridad para recuperar las mismas.

  4. Para ello, ejecutar el siguiente comando en un terminal:

    tail -f /var/log/syslog /var/log/mysql/error.log

    …y en otro terminal, el siguiente comando para localizar la tabla en fallo al ser comprobada:

    mysqlcheck --defaults-file=/etc/mysql/debian.cnf

    NOTA: este comando o cualquier otra técnica es válida para localizar las tablas. Si se saben cuáles son, por consultas concretas de fallos o inspección de los logs, saltar al paso siguiente directamente.

  5. Una vez localizada la tabla, borrar el fichero .ibd, conservando el .frm de la tabla afectada:

    # Hacemos copias copia de los ficheros que se van a borrar
    cp /var/lib/mysql/.* /root/

    # Borramos el fichero ibd (innodb)
    cd /var/lib/mysql/
    rm -f tabla_en_fallo.ibd # MUY IMPORTANTE, CONSERVAR EL tabla_en_fallo.frm

  6. A continuación, conectamos al mysql y ejecutamos lo siguiente hasta que haga drop table.
    NOTA: puede fallar en algunas ocasiones, insistir como se muestra, hasta que de “Unknown table” o similar:

    > mysql --defaults-file=/etc/mysql/debian.cnf
    
    mysql> use mysql
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A
    
    Database changed
    mysql> drop table <base-de-datos-en-fallo>.<tabla-en-fallo>;
    ERROR 2013 (HY000): Lost connection to MySQL server during query
    
    mysql> drop table <base-de-datos-en-fallo>.<tabla-en-fallo>;
    ERROR 2006 (HY000): MySQL server has gone away
    No connection. Trying to reconnect...
    ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    ERROR:
    Can't connect to the server
    
    mysql> drop table <base-de-datos-en-fallo>.<tabla-en-fallo>;
    No connection. Trying to reconnect...
    ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    ERROR:
    Can't connect to the server
    
    mysql> drop table <base-de-datos-en-fallo>.<tabla-en-fallo>;
    No connection. Trying to reconnect...
    Connection id:    1
    Current database: mysql
    
    Query OK, 0 rows affected (0.02 sec)
    
    mysql> drop table <base-de-datos-en-fallo>.<tabla-en-fallo>;
    ERROR 1051 (42S02): Unknown table '<base-de-datos-en-fallo>.<tabla-en-fallo>'
    
  7. Una vez terminado, volver al paso 4 hasta que la comprobación de ok y termine sin romperse nada.

  8. A partir de aquí utilizar una copia de seguridad de /var/backups/mysql-backups/.txt.gz.

    Copiarla en un directorio independiente y con un editor seleccionar la tabla y los datos a importar.

    A continuación importar sobre la base de datos reparada para recuerar las tablas borradas.

4. Referencias

  1. Este procedimiento guarda similitudes con:
  • MYSQL-RECOVER-SPACE-HEADER-PAGE-CONSISTS-OF-ZERO-BYTES-IN-TABLESPACE-001