Databases

SQL

MariaDB

Installation

CentOS [1]:

$ sudo vim /etc/yum.repos.d/mariadb.repo
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.1/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
$ sudo yum install MariaDB-server MariaDB-client

Configuration Tuning

Here are common optimizations for MariaDB’s configuration. [12][13][14]

# Enable query cache.
query_cache_type = 1
# Increase in increments of 32M.
query_cache_size = 128M
# Increase temporary table size. Set these two options to the same value.
tmp_table_size = 1G
max_heap_table_size = 1G
# Set to ~85% of available RAM if it is the only major service.
innodb_buffer_pool_size = 6.8G
# Avoid reverse DNS lookups.
skip-name-resolve

Adjust the thread_cache_size until the percentage calculated below is as close to 100% as possible. [15]

mysql> SHOW STATUS LIKE 'Threads_created';
mysql> SHOW STATUS LIKE 'Connections';
mysql> SELECT 100 - (( <THREADS_CREATED> / <CONNECTIONS> ) * 100) as "Threads_created % of Connections"\G

The MySQLTuner script [https://github.com/major/MySQLTuner-perl] can also help to automatically find settings that should be changed based on MariaDB’s recent utilization.

Use the MySQL Calculator [https://www.mysqlcalculator.com/] to ensure that the new settings will not use more than the available RAM on the system.

Table Engines

A full list of the available engines are provided at https://mariadb.com/kb/en/mariadb/storage-engines.

CassandraSE

The Cassandra Storage Engine (CassandraSE) is used to connect to a NoSQL Cassandra cluster. This allows the relational management of MariaDB to work with the fast and scalable Cassandra server. [2]

The Cassandra storage engine is missing from the official RHEL 7 repositories but can be installed from the RHEL 6 repository. [4]

$ sudo yum install http://yum.mariadb.org/10.1/centos6-amd64/rpms/MariaDB-10.1.18-centos6-x86_64-cassandra-engine.rpm

As the root MariaDB user, load the Cassandra library.
```
$ sudo mysql
> install soname 'ha_cassandra.so';
```

Configure the default Cassandra rpc/client IP to connect to.

> SET GLOBAL cassandra_default_thrift_host='<CASSANDRA_IP>'

The Cassandra table has to be created with the “COMPACT STORAGE” option. Otherwise, MariaDB will be unable to access the table properly.
```
$ cqlsh
> CREATE TABLE <NAME> (<VALUE1> <TYPE> PRIMARY KEY, <VALUE2> <TYPE>) WITH COMPACT STORAGE;
```
Create a Cassandra table within a database. MariaDB needs to know the keyspace and the column/table to map to as well as what IP address to use to connect to Cassandra.
```
> CREATE TABLE <NAME> ENGINE=cassandra KEYSPACE="<KEYSPACE_NAME>" COLUMN_FAMILY="<COLUMN_FAMILY_NAME>";
```

[2][3]

Clustering

Maxscale

MaxScale is a proxy that can load balance requests in different ways. This is useful for specifying reads and writes to specific servers. [5]

Configuration

The latest version of MariaDB’s MaxScale can be found at https://mariadb.com/downloads/maxscale.

RHEL Install [6]:

$ sudo yum install https://downloads.mariadb.com/MaxScale/2.0.1/rhel/7/x86_64/maxscale-2.0.1-1.rhel.7.x86_64.rpm

MaxScale requires the configuration of a listener that is associated with a router that serves requests to/from a list of servers.

Listener options:

[NAME]
type=listener
service = Specify the back-end service to use. This is usually a router.
protocol=MySQLClient
address = Specify the address to listen on.
port=3306

Listener example:

[listener]
type=listener
service="Read Write Service"
protocol=MySQLClient
port=3306

Router options:

[NAME]
type=service
router
- readconnroute = Load balance requests.
- readwritesplit = Send write requests to one node and read queries to all nodes.
- schemarouter = Shard databases. Requests to a particular database will be routed to a specific server.
- binlogrouter = Copy binary logs from servers other servers. If a back-end server fails, MaxScale will replace it and serve read requests from the available binary log.
router_options
- master = Write only.
- slave = Read only.
- master,slave = Read and write.
servers = A comma separated list of back-end servers.
user = Specify a MySQL user to connect as.
passwd = Specify the password for the MySQL user.

Example:

[Read Write Service]
type=service
router=readwritesplit
servers=server1,server2,server3
user=maxscale
passwd=123456

Server options:

[NAME]
type=server
address = Specify the address of the MySQL server.
port= Specify the MySQL port (default: 3306).
protocol=MySQLBackend

[7]

For replication, a maxscale MySQL user needs “REPLICATION SET” and “SELECT” grants for all databases.

GRANT REPLICATION SET, SELECT ON *.* TO 'maxscale'@'%' IDENTIFIED BY 'securepassword123';

In a master-slave configuration, at least two servers are required to be running. This is because MaxScale is unsure if other nodes are present and cannot determine if a server is a master or a slave. This will prevent it from working properly and this error will occur for all connections. [8] It is ideal to follow the quorum theory by having 3 servers to support a failed host properly.

ERROR 1045 (28000): failed to create new session

$ sudo yum install java-1.8.0-openjdk

File: /etc/yum/repos.d/datastax.repo

[datastax]
name = DataStax Repo for Apache Cassandra
baseurl = http://rpm.datastax.com/community
enabled = 1
gpgcheck = 0

$ sudo yum install cassandra30
$ sudo systemctl daemon-reload

Configuration options:

cluster_name = The unique name for a cluster. Default: Test Cluster.
listen_address = The IP address to listen on for clustering. Default: localhost.
listen_interface = The network interface to listen on for clustering. Default: eth0.
rpc_address = The IP address to listen on for client requests. Default: localhost.
rpc_interface = The network interface to listen on for client requests. Default: eth1.
start_rpc = Start the client service to allow incoming connections. Default: false.
disk_optimization_strategy = Specify the type of disk to optimize reads/writes for. Default: ssd.
- ssd = Solid state drivers.
- spinning = Spinning disk hard drives.
disk_failure_policy = The action to take when a disk is missing or in a failed state. Default: stop.
- best_effort = Do not use the disk but attempt to respond to requests with any data available.
- die = Kill off all processes.
- ignore = Ignore any major I/O errors and provide failure responses to any requests.
- stop = Gracefully stop the service.
endpoint_snitch = Select a snitch interface for clustering.
- CloudstackSnitch = Integrate with the Apache Cloudstack.
- Ec2Snitch = Cluster based on Amazon EC2 regions and compute availability zones.
- Ec2MultiRegionSnitch = Allows multiple Amazon EC2 regions to be used via public floating IPs.
- GoogleCloudSnitch = Cluster based on the Google Cloud Platform’s regions and compute availability zones.
- GossipingPropertyFileSnitch = Cluster based on the datacenter and rack location. Recommended for a multidatacenter cluster.
- RackInferringSnitch = Similar to GossipingPropertyFileSnitch except that the datacenter is automatically determined by the 2nd octet of the IP and the rack is determined by the 3rd.
- SimpleSnitch = Cluster based on proximity, but datacenter and rack location does not matter. Recommended for clusters in one region.
seed_provider = The IP addresses of Cassandra servers in other datacenters to replicate to. At least one node should be a seed provider in every datacenter. Not all nodes should be seed providers due to that leading to performance issues.
- class_name: org.apache.cassandra.locator.SimpleSeedProvider
  - parameters:
  - seeds: “<IP_ADDRESS_1>”, “<IP_ADDRESS_2>”
concurrent_reads = Default: 32. Recommended: (16 * <COUNT_OF_DISKS>).
concurrent_writes = Default: 32. Recommended: (16 * <COUNT_OF_CPU_CORES>).
concurrent_counter_writes = Default: 32. Recommended: 16 * <COUNT_OF_DISKS>).
concurrent_batchlog_writes = Default: 32. Recommended: (16 * <COUNT_OF_CPUS>).
concurrent_materialized_view_writes = Default: 32. Recommended: Use less than the concurrent reads/writes.
incremental_backups = Default: false. Choose whether or not to use incremental backups. When taking snapshots, hardlinks will be used to refer back to old data for efficient backups.
snapshot_before_compact = Default: false. Choose whether or not to automatically take backups before running a compaction.

[11]

History

Latest [https://github.com/LukeShortCloud/rootpages/commits/main/src/http/databases.rst]
< 2020.01.01 [https://github.com/LukeShortCloud/rootpages/commits/main/src/administration/databases.rst]
< 2019.01.01 [https://github.com/LukeShortCloud/rootpages/commits/main/src/databases.rst]
< 2018.01.01 (NoSQL) [https://github.com/LukeShortCloud/rootpages/commits/main/markdown/nosql.md]
< 2018.01.01 (SQL) [https://github.com/LukeShortCloud/rootpages/commits/main/markdown/sql.md]

Bibliography

“Installing MariaDB with yum.” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb/yum/
“Cassandra Storage Engine Overview.” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb/cassandra-storage-engine-overview/
“Cassandra Storage Engine Use Example.” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb/cassandra-storage-engine-use-example/
“Missing CentOS7 RPM: MariaDB-10.1.16-centos7-x86_64-cassandra-engine.rpm?” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb/missing-centos7-rpm-mariadb-10116-centos7-x86_64-cassandra-enginerpm/
“About MariaDB MaxScale.” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb-enterprise/about-mariadb-maxscale/
“MariaDB MaxScale Installation Guide.” MariaDB Knowledgebase. Accessed October 22, 2016. https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale-14/mariadb-maxscale-installation-guide/
“MaxScale Configuration & Usage Scenarios.” MariaDB Knowledgebase. Accessed October 22, 2016. https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale-14/maxscale-configuration-usage-scenarios/
“Issue with MaxScale when slaves are broken.” MaxScale Google Groups. August 28, 2014. Accessed November 12, 2016. https://groups.google.com/forum/#!topic/maxscale/HK49D15s21s
“How To Install Cassandra on CentOS 7” liquidweb Knowledgebase. Accessed October 16, 2016. https://www.liquidweb.com/kb/how-to-install-cassandra-on-centos-7/
“Installing the DataStax Distribution of Apache Cassandra 3.x on RHEL-based systems.” DataStax Distribution of Apache Cassandra 3 Documentation. October 14, 2016. Accessed October 16, 2016. http://docs.datastax.com/en/cassandra/3.x/cassandra/install/installRHEL.html
“The cassandra.yaml configuration file.” DataStax Documentation. Accessed February 8, 2018. http://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html
“Get the Best Out of MariaDB with Performance Tuning.” Open Source For You. May 1, 2017. Accessed May 11, 2020. https://opensourceforu.com/2017/05/get-best-mariadb-performance-tuning/
“Calculating InnoDB Buffer Pool Size for your MySQL Server.” ScaleGrid Blog. March 28, 2018. Accessed May 11, 2020. https://scalegrid.io/blog/calculating-innodb-buffer-pool-size-for-your-mysql-server/
“skip-name-resolve to speed up MySQL and avoid problems.” VION Technology Blog. September 18, 2012. Accessed May 11, 2020. https://www.vionblog.com/skip-name-resolve-to-speed-up-mysql-and-avoid-problems/
“MySQL Optimization Tip - thread_cache_size.” Another MySQL DBA. September 2, 2013. http://anothermysqldba.blogspot.com/2013/09/mysql-optimization-tip-threadcachesize.html