The "Kubernetes is weak at persistence" claim is something of a myth, or outdated information at best. Not sure where it got started.
Kubernetes has extensive support for persistent volumes. Either use a static volume (create it manually and tell Kubernetes where to mount it from), or a persistent volume claim (Kubernetes manages it for you).
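For anyone who hasn't seen one, here is a rough sketch of what the persistent volume claim route can look like. It builds the manifest as a plain Python dict and prints it as JSON. The function name `make_pvc`, the claim name `mysql-data`, and the `10Gi` size are made up for illustration; only the manifest fields themselves come from the Kubernetes API.

```python
import json


def make_pvc(name, size, storage_class=None):
    """Build a PersistentVolumeClaim manifest as a plain dict.

    `name`, `size`, and `storage_class` are illustrative parameters,
    not anything prescribed by the thread.
    """
    spec = {
        # ReadWriteOnce: the volume is mounted read-write by a single node.
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    # Leaving storageClassName out lets the cluster's default class apply.
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    # Hypothetical claim for a small MySQL data directory.
    print(json.dumps(make_pvc("mysql-data", "10Gi"), indent=2))
```

Since kubectl accepts JSON as well as YAML, the output can be piped straight in, e.g. `python pvc.py | kubectl apply -f -` (the file name is just an example).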
Running a database under Kubernetes isn't particularly problematic, but like anything else, you have to be aware of all the failure scenarios. While Kubernetes is great at simplifying operations, it doesn't actually eliminate any of the foundational, operational domain knowledge needed (Linux, database, cloud); it adds one additional domain needed to operate it (i.e., Kubernetes itself).
In other words, you need to know what you're doing. But the failure scenarios aren't that different, in particular when it comes to HA (i.e. automated failover). If you're currently running a single cloud VM with MySQL, for example, your HA story (i.e. no HA) is exactly the same on Kubernetes, and it's arguably easier to get HA up and running.
"persistent data seems to be a weak point with Kubernetes"
I have to disagree with that. Kubernetes has plenty of support for persistent data. Whether or not you should be running your persistent data service in Kubernetes is a different question.
Could you get postgres/mysql/mongo/whatever working in k8s? Absolutely.
Should you? Maybe. Try it out. Both ways. Then make your decision.
This has nothing to do with a weak point in k8s. This has everything to do with wrestling with distributed computing, k8s or otherwise.
One option: run it outside k8s.
GKE makes it easy to talk to Google Cloud SQL, which runs outside of GKE.
Is it possible? Yes; read up on StatefulSets, taints, and node affinity. Is it advisable? Depends on your use case.
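To make the StatefulSets, taints, and node affinity pointer a bit more concrete, here is a minimal sketch that assembles a StatefulSet manifest in Python and prints it as JSON for kubectl. Everything named here is an assumption for illustration: the function `make_statefulset`, the `postgres:16` image, the mount path, the `db-node` node label, and the `dedicated-db` taint key. It also assumes a headless Service with the same name already exists, since `serviceName` refers to one.

```python
import json


def make_statefulset(name, image, replicas, size, node_label, taint_key):
    """Build a StatefulSet manifest (as a dict) pinned to labelled, tainted nodes."""
    labels = {"app": name}
    pod_spec = {
        "containers": [{
            "name": name,
            "image": image,
            # Mount path is a placeholder; use your database's data directory.
            "volumeMounts": [{"name": "data", "mountPath": "/var/lib/data"}],
        }],
        # Tolerate the (hypothetical) taint that keeps other workloads off DB nodes.
        "tolerations": [
            {"key": taint_key, "operator": "Exists", "effect": "NoSchedule"}
        ],
        # Only schedule onto nodes carrying the (hypothetical) label node_label=true.
        "affinity": {"nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{"matchExpressions": [
                    {"key": node_label, "operator": "In", "values": ["true"]}
                ]}]
            }
        }},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # assumes a headless Service of this name exists
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {"metadata": {"labels": labels}, "spec": pod_spec},
            # One PersistentVolumeClaim is created per replica from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


if __name__ == "__main__":
    manifest = make_statefulset("pg", "postgres:16", 3, "20Gi", "db-node", "dedicated-db")
    print(json.dumps(manifest, indent=2))
```

The `volumeClaimTemplates` part is what gives each replica its own claim, and that claim sticks with the pod's identity (`pg-0`, `pg-1`, ...) across rescheduling. Whether you also want the taint and affinity depends on your use case, as the comment above says.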
Kubernetes is good for running DBs. Most of the tasks are the same whether you run in k8s or outside it (bare metal, VM, etc.).
But for storage you have a few extra options you can use (in this order):
1. Cloud provider "SAN" (EBS & co.). Kubernetes can automatically create, attach, detach, and re-attach volumes to your node, depending on where your Pod is scheduled.
2. Self-replication with local volumes. Many DBs have some kind of replication. You pin your Pods to specific host nodes with local volumes. This can deliver great performance, especially in the big data area. But you lose some of the flexibility of Kubernetes scheduling.
3. NAS (NFS, EFS, etc.). Many people have a vague aversion to NAS, but for small-to-medium DBs it is a cost-effective ($ and dev time) solution with well-understood implications. You also get full flexibility in k8s scheduling.
4. Solutions like GlusterFS, Rook, etc. They are great, but they are complex services that are hard to operate.
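For option 2 in the list above, the pinning is visible in the volume definition itself: a local PersistentVolume has to declare which node it lives on. Below is a rough sketch, again as a Python dict printed as JSON. The helper name `make_local_pv`, the node name `node-a`, the path `/mnt/disks/ssd1`, and the `local-storage` class name are all placeholders, not anything from the comment.

```python
import json


def make_local_pv(name, path, node_name, size, storage_class="local-storage"):
    """Build a local PersistentVolume manifest tied to one specific node."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            # Retain: keep the data on disk if the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": storage_class,
            # Directory or mount point of the disk on the node.
            "local": {"path": path},
            # Local volumes must say which node holds the disk; pods using
            # this volume can then only be scheduled onto that node.
            "nodeAffinity": {"required": {"nodeSelectorTerms": [{
                "matchExpressions": [{
                    "key": "kubernetes.io/hostname",
                    "operator": "In",
                    "values": [node_name],
                }]
            }]}},
        },
    }


if __name__ == "__main__":
    pv = make_local_pv("db-local-0", "/mnt/disks/ssd1", "node-a", "100Gi")
    print(json.dumps(pv, indent=2))
```

This is where the lost scheduling flexibility comes from: if `node-a` goes away, so does the volume, which is why this option leans on the database's own replication.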
Have you taken a look at Rook yet? It integrates natively into Kubernetes to provide persistent storage in any environment (bare metal, cloud providers, etc.).
https://github.com/rook/rook
disclaimer: I'm a maintainer on that project :)
As mentioned by others, persistence should no longer be something to worry about. In our experience running both Redis and MongoDB StatefulSets, the majority of the issues we have had were around the dynamic networking of Kubernetes and the databases' resilience to it. Much of this was our inexperience as well. The way we had traditionally run Mongo replica sets needed some tuning to be more resilient to pod failures, network partitions, etc. Again, this is probably also heavily dependent on the database and the topology that you are aiming for.
Take a look at StorageOS. It's designed for running highly available databases in Docker/Kubernetes, and it doesn't place restrictions on where pods need to be scheduled.
Docs for Kubernetes
Disclaimer: I work on StorageOS.