In the previous blog (How to deploy SQL Server containers to a Kubernetes cluster for high availability? – Beyond the Horizon… (livenetwork.in) we learned how to deploy a SQL server container to the Kubernetes Cluster. In this blog, we will extend the same concept to deploy the SQL server Always on Availability group on Azure Kubernetes Services. I have developed a video for this and you can watch this video on the below link.
What is SQL server Always on Availability Group?
An availability group solution supports a set of primary user databases and one to eight sets of corresponding secondary user databases(for fault tolerance). The secondary databases are kept up to date with the primary databases by transferring transaction log blocks from each primary database to every secondary database either synchronously or asynchronously over the network. It provides Database Level Protection. Data streams between replicas are encrypted with certificates for the purpose of security and Failover can be implemented either automatically or manually.
- SQL Server AlwaysOn Availability Groups provide us the capability to group two or more databases and perform a failover of both as a logical unit.
- A single primary replica, which hosts the primary read/write databases.
- One to eight secondary replicas, each of which hosts a set of secondary read-only databases and serves as a potential failover target for the availability group.
- An availability group supports a failover environment for a discrete set of user databases, known as availability databases, that fail over together as a unit. An availability group supports a set of primary databases and one to eight sets of corresponding secondary databases.
- It can have synchronous and Asynchronous Replicas, but it can have up to three synchronous Replicas only.
- An availability group fails over at the level of an availability replica. An availability replica provides redundancy only at the database level—for the set of databases in one availability group.
- The primary replica makes the primary databases available for read-write connections from clients, and additionally, sends transaction log records of each primary database (data synchronization) to every secondary database.
- Each secondary replica caches the transaction log records, (hardens the log) and then applies them to its corresponding secondary database. Data synchronization occurs between the primary database and each connected secondary database, independently of the other databases, therefore, a secondary database can be suspended or fail without affecting other secondary databases, and a primary database can be suspended or fail without affecting other primary databases.
Data Synchronization Options.
- Synchronous data movement guarantees that if a commit confirmation has been received by the client, the log record has been hardened to the primary replica as well as the secondary replica. This means the secondary has 100% of the data that the primary has. If the secondary needed to take over, it has all the needed data. This gives us the capability of zero data loss and comes with little overhead of increased transaction latency. This also gives the option of automatic failover to the secondary replica because the data is already present
- Asynchronous data movement doesn’t guarantee zero-data loss because log records for a transaction might arrive a bit later. In this case, the primary replica commits the transaction after hardening into the local log file, however, it does not wait for the hardening into the secondary replica. The secondary is hardened asynchronously. An advantage of this configuration is that the primary replica does not have to wait for confirmation from an asynchronous secondary, although the disadvantage is that the secondary replica can be lagging behind the primary.
How do Availability Groups Work?
Synchronous and asynchronous commit between Primary replicas and Secondaries are depicted in this diagram.
Kubernetes and SQL server Availability Group Architecture
SQL Server availability groups are implemented thru Kubernetes replicaSets. So in case of Primary / Secondary replica failures, Kubernetes can create the POD or Node again and the service can work without interruption.
Why it is important to use Kubernetes for SQL Always on Availability?
- No need to set up windows failover clustering or Linux Pacemaker Cluster.
- Kubernetes provides automatic failover capability.
- We can create up to 17 distributed replicas.
Kubernetes configuration.
Here is the configuration of Primary Replica and Persistent volume claim.
Here is the CLI code to make it work.
#Login and Create Resource Group az login az group create --name RG-SQLAG --location eastus #Create AKS cluster az aks create --resource-group RG-SQLAG --name SQLAG --node-count 3 \ --generate-ssh-keys --node-vm-size=Standard_B2ms # Get credentials for the cluster so we can connect az aks get-credentials --resource-group RG-SQLAG --name SQLAG # List nodes kubectl get nodes # List Pods kubectl get pods # Create external storage with PV and PVC kubectl apply -f sqlAGstorage.yaml # Display the persistent volume and claim kubectl get pv kubectl get pvc # Use K8 secrets to hold required sa password for SQL Server container kubectl create secret generic mssql-secret \ --from-literal=SA_PASSWORD="Welcome@0001234567" # Deploy the SQL Server 2019 container # Primary Replica Deployment kubectl apply -f sqlAGPdeployment.yaml --record # Secondary Replica1 Deployment kubectl apply -f sqlAGS1deployment.yaml --record # Secondary Replica2 Deployment kubectl apply -f sqlAGS2deployment.yaml --record # Get the status of the nodes, pods and service kubectl get all # Retrieve external IP address of the primary and secondary replicas ip1=$(kubectl get services | grep mssql-primary | cut -c49-63) ip2=$(kubectl get services | grep mssql-secondary1 | cut -c49-63) ip3=$(kubectl get services | grep mssql-secondary2 | cut -c49-63) echo IP addresses of the replicas $ip1, $ip2, $ip3 #Create Primary & secondary replicas from SQL script and create certificates in #Primary replica # Now Copy the certificates from the primary to the local # First Retrieve pod name to variable podagp=$(kubectl get pods -l app=mssql-primary -o custom-columns=:metadata.name) podags1=$(kubectl get pods -l app=mssql-secondary1 -o custom-columns=:metadata.name) podags2=$(kubectl get pods -l app=mssql-secondary2 -o custom-columns=:metadata.name) #prepare variables PathToCopyCert=${podagp}":var/opt/mssql/ag_certificate.cert" PathToCopyCertKey=${podagp}":var/opt/mssql/ag_certificate.key" # First copy to local kubectl cp $podagp:var/opt/mssql/ag_certificate.cert ag_certificate.cert kubectl cp $podagp:var/opt/mssql/ag_certificate.key ag_certificate.key # Copy the certificate from local host to secondary1 echo Copying AG certificates from localhost to pod $podags1 kubectl cp ag_certificate.cert $podags1:var/opt/mssql kubectl cp ag_certificate.key $podags1:var/opt/mssql # Next copy to secondary2 kubectl cp ag_certificate.cert $podags2:var/opt/mssql kubectl cp ag_certificate.key $podags2:var/opt/mssql # Now start the testing if SQL Always on Availabilty by causing failover podname=$(kubectl get pods -l app=mssql-primary -o custom-columns=:metadata.name) echo $podname kubectl delete pod $podname sleep 3 kubectl get pods # Some Important troubleshooting commands #Get the current status of the deployment kubectl get all #Get the current resource usage kubectl top nodes kubectl top pods #Retrieve pod IP addresses kubectl get services #Retrieving YAML declarations kubectl get <pod_name> -o yaml # Dump POD info kubectl describe pods mssqlag-secondary1-deployment-c7559dd7d-w42lz #Describe a deployment kubectl describe deployments #Describe a pod kubectl describe pod -1 app=mssql-primary #Describe the storage kubectl describe pv kubectl describe pvc # Display the Kubernetes Logs kubectl logs -l app=mssql # Display the logs from a previous run# Use the -p flag to look at the logs #from containers that have exited # Display the all events sorted kubectl get events --sort-by=.metadata.creationTimestamp # Display pod events kubectl get events --field-selector involvedObject.kind=Pod # Delete the deployment kubectl delete -f sqlAGPdeployment.yaml # Getting Kubernetes secrets kubectl get secret mssql-secret -o jsonpath="{.data.SA_PASSWORD}" | base64 --decode # Get pod name podname=$(kubectl get pods -l app=mssql-primary -o custom-columns=:metadata.name) # Running individual commands in the pod’s container kubectl exec -it $podname--top # Opening an interactive shell in the pod’s container kubectl exec -it $podname--/bin/bash #Run command on POD directly without login kubectl exec -it $podname -- /opt/mssql-tools/bin/sqlcmd -S. -U sa -P SQL2019AGPassword -Q 'select @@version' # Browse AKS dashboard az aks browse --resource-group RG-SQLAG --name SQLAG
You can find the entire code in my Github repo: Live Network /SQL-Always-on-AKS (github.com).
Hope you enjoyed this blog!!