K8s, Swarm, Traefik and Talkyard

By KajMagnus @KajMagnus 2018-11-01 10:11:27.960Z

It'd be good if Talkyard was simple to configure together with Kubernetes and Docker Swarm, and with Traefik or Caddy (proxy servers with automatic HTTPS).

Maybe these could be the first steps: (or "all" that's needed?)

  • Provide a sample Docker-Compose file with custom networking and volume containers, which one can copy and edit and append to one's own Docker-Compose or Swarm configuration.
  • This Docker-Compose sample file should assume that there's a reverse proxy somewhere (like, Traefik) that provides HTTPS. (Right?)
  • Figure out some way to take regular backups, although in Swarm and K8S, there's no Cron job on any host node that can do that.
  • Is there any good way to automatically upgrade the Talkyard images & container, when a new version is available?
  • Make it simpler to build Talkyard images — so people can fix Swarm / K8s related issues, and build and test, without running into confusion. (Here's a topic about that.)

There are 10 replies. Estimated reading time: 12 minutes

  1. Jerry Koerkenmeier @gkoerk 2018-11-01 10:54:49.769Z

    FULL DISCLOSURE: I run a Docker Swarm and would LOVE to get Talkyard working in a stack.

    Provide a sample Docker-Compose file with custom networking and volume containers, which one can copy and edit and append to one's own Docker-Compose or Swarm configuration.

    YES!

    This Docker-Compose sample file should assume that there's a reverse proxy somewhere (like, Traefik) that provides HTTPS. (Right?)

    YES!

    Figure out some way to take regular backups, although in Swarm and K8S, there's no Cron job on any host node that can do that.

    Yes. I have a very good way of handling this: add a second, nearly identical database image (which also contains the client) that connects to the other DB in the stack and runs the relevant backup command (mysqldump, pg_dump, mongodump, etc.), sleeps for a defined interval between backups, and only retains backups for N days (which gives you full coverage as long as you back up the filesystem location where the backup files are stored).
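
    A minimal sketch of that sidecar for a PostgreSQL database, not a tested setup. The service name db, the database name talkyard, the postgres user, the password handling, the dump path, and the BACKUP_* variables are all assumptions for illustration. It prunes by file age with find, and it expects a db service on the same internal network elsewhere in the stack.

    ```yaml
    version: "3.0"

    services:
      # Hypothetical sidecar: uses a PostgreSQL image so that pg_dump is available.
      db-backup:
        image: postgres:10
        environment:
          PGPASSWORD: change-me       # assumption: password auth against the db service
          BACKUP_FREQUENCY: 1d        # handed to sleep between backups
          BACKUP_DAYS_KEEP: "7"       # dumps older than this many days are deleted
        networks:
          - internal
        volumes:
          - /share/appdata/talkyard/database-dump:/dump
        entrypoint:
          - bash
          - -c
          - |
            # $$ is the Compose escape for a literal $ inside the container.
            while true; do
              pg_dump -h db -U postgres -d talkyard | gzip -c > "/dump/dump_$$(date +%Y-%m-%d_%H_%M_%S).sql.gz"
              find /dump -name 'dump_*.sql.gz' -mtime "+$$BACKUP_DAYS_KEEP" -delete
              sleep "$$BACKUP_FREQUENCY"
            done

    networks:
      internal:
        driver: overlay
    ```

    The dumps land in a bind-mounted host directory, so backing up that directory covers the database.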

    I'm not well suited to answer the last two points, but I think they do belong in the list.

    Is there any good way to automatically upgrade the Talkyard images & container, when a new version is available?

    Ideally I think this would be handled by an "automated build" in the Docker repo which is triggered by changes in GitHub. Usually, providing a new version of the images and updating the :latest tag to point to the new images means all end users need to do is a docker stack deploy to have the image updated. One key to making this simple is to expose all the static, required-for-restore data in a volume the user binds to either a Docker named volume or (preferably to me) a location on the (shared) storage using a bind mount. This way, the container itself is ephemeral. Just stop the old image and run the new one.
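
    A small sketch of the volume layout described above. The image name example/app, the stack file name app.yml, and both paths are hypothetical placeholders, not Talkyard's actual layout.

    ```yaml
    version: "3.0"

    services:
      app:
        image: example/app:latest     # hypothetical image that tracks the :latest tag
        volumes:
          # Bind mount on (shared) storage: everything needed for a restore lives here,
          # so the container itself can be thrown away and recreated.
          - /share/appdata/app/data:/data
          # Alternative: a Docker named volume instead of the bind mount.
          # - app_data:/data

    # Only needed if the named-volume alternative is used.
    # volumes:
    #   app_data:
    ```

    With this layout, redeploying with docker stack deploy -c app.yml app replaces the container while the data under /share/appdata/app/data stays in place.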

    BTW, here's a link on a possibly better and newer way to handle this: Docker Deploy Webhook, whereby you could provide instructions on automating new builds all the way into the swarm itself, requiring no manual effort at all.

    1. KajMagnus @KajMagnus 2018-11-05 09:42:22.396Z

      Ok :- ) I'll read about Swarm and services, start creating a sample Docker-Compose file, and, later when I've experimented a bit myself and know & understand all this a bit more, I'll reply to the things you wrote.

      1. Jerry Koerkenmeier @gkoerk 2018-11-05 23:45:35.694Z

        Want an example of a swarm .yml file with networking and traefik labels?

        1. KajMagnus @KajMagnus 2018-11-06 03:46:57.481Z

          Yes that'd be helpful

          1. Jerry Koerkenmeier @gkoerk 2018-11-06 18:04:17.111Z (edited 2018-11-06 18:14:36.676Z)

            The first example is Nextcloud (nextcloud.yml). It will install the first time with docker stack deploy nextcloud -c /path/to/nextcloud.yml and bring you to an install screen.

            version: "3.0"
            
            services:
              
              nextcloud:
                image: nextcloud:latest
                env_file: /share/appdata/config/nextcloud/nextcloud.env
                networks:
                  - internal
                  - traefik_public
                depends_on: 
                  - db
                deploy:
                  labels:
                    - traefik.frontend.rule=Host:nextcloud.gkoerk.com
                    - traefik.docker.network=traefik_public
                    - traefik.port=80
                volumes:
                  - /share/appdata/nextcloud:/var/www/html
                  - /share/appdata/nextcloud/apps:/var/www/html/custom_apps
                  - /share/appdata/nextcloud/config:/var/www/html/config
                  - /share/appdata/nextcloud/data:/var/www/html/data
            
              db:
                image: mariadb:10
                env_file: /share/appdata/config/nextcloud/nextcloud.env
                networks:
                  - internal
                volumes:
                  - /share/runtime/nextcloud/db:/var/lib/mysql
            
              db-backup:
                image: mariadb:10
                env_file: /share/appdata/config/nextcloud/nextcloud-db-backup.env
                depends_on:
                  - db
                volumes:
                  - /share/appdata/nextcloud/database-dump:/dump
                  - /etc/localtime:/etc/localtime:ro
                entrypoint: |
                  bash -c 'bash -s <<EOF
                  trap "break;exit" SIGHUP SIGINT SIGTERM
                  sleep 2m
                  while /bin/true; do
                    mysqldump -h db --all-databases | gzip -c > /dump/dump_\`date +%d-%m-%Y"_"%H_%M_%S\`.sql.gz
                    (ls -t /dump/dump*.sql.gz|head -n $$BACKUP_NUM_KEEP;ls /dump/dump*.sql.gz)|sort|uniq -u|xargs -r rm --
                    sleep $$BACKUP_FREQUENCY
                  done
                  EOF'
                networks:
                - internal
            
              redis:
                image: redis:alpine
                depends_on:
                  - nextcloud
                networks:
                  - internal
                volumes:
                  - /share/runtime/nextcloud/redis:/data
            
              solr:
                image: solr:6-alpine
                depends_on:
                  - nextcloud    
                networks:
                  - internal
                volumes:
                - /share/runtime/nextcloud/solr:/opt/solr/server/solr/mycores
                entrypoint:
                  - docker-entrypoint.sh
                  - solr-precreate
                  - nextant
            
              cron:
                image: nextcloud
                volumes:
                  - /share/appdata/nextcloud:/var/www/html
                depends_on:
                  - nextcloud
                user: www-data
                networks:
                  - internal
                entrypoint: |
                  bash -c 'bash -s <<EOF
                    trap "break;exit" SIGHUP SIGINT SIGTERM
                    while [ ! -f /var/www/html/config/config.php ]; do
                      sleep 1
                    done
                    while true; do
                      php -f /var/www/html/cron.php
                      sleep 15m
                    done
                  EOF'
            
            networks:
              traefik_public:
                external: true
              internal:
                driver: overlay
                ipam:
                  config:
                    - subnet: 172.16.254.0/24
            

            The definition of the "traefik_public" network is external and created via docker network create --driver=overlay --subnet=172.1.1.0/21 --attachable traefik_public

            You can see that enabling Traefik (once it's already running) for these containers is as simple as giving them appropriate LABEL values for Traefik to interpret. NOTE - In regular docker-compose, the labels are applied at the same level as networks:, environment:, etc., but in a Swarm stack file they are placed under the deploy: section.
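
            To make the difference in label placement concrete, here is a side-by-side sketch that reuses the labels from the stack file above. The two YAML documents are alternatives, not one file.

            ```yaml
            # Regular docker-compose: labels sit directly on the service (container labels).
            version: "3.0"
            services:
              nextcloud:
                image: nextcloud:latest
                labels:
                  - traefik.frontend.rule=Host:nextcloud.gkoerk.com
                  - traefik.port=80
            ---
            # Swarm stack file: the same labels go under deploy: (service labels).
            version: "3.0"
            services:
              nextcloud:
                image: nextcloud:latest
                deploy:
                  labels:
                    - traefik.frontend.rule=Host:nextcloud.gkoerk.com
                    - traefik.port=80
            ```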

            The key is that while all your individual containers can speak to one another easily (since they are on the same overlay network, named "internal"), they cannot speak to Traefik unless they are also on its network (traefik_public). So we've added significant security to our stack by keeping every other service shielded from direct Internet access.

            1. Jerry Koerkenmeier @gkoerk 2018-11-06 23:40:00.069Z

              I can also share my Traefik config if you like.

              1. KajMagnus @KajMagnus 2018-11-07 04:32:36.111Z

                Yes please

                1. Jerry Koerkenmeier @gkoerk 2018-11-07 13:01:38.960Z

                  On it. By the way, is the talkyard-web docker image (which runs NGINX) configured to serve as an SSL termination point, or does it act as a passthrough proxy? I think that's my only issue right now. I have it running in a VM, with Traefik proxying for the VM, but then I get errors. Can I bypass NGINX and point Traefik directly to talkyard-app (maybe I would need to expose a port)? If so, which port? Or else, how can I change the default NGINX config so that conf/sites-enabled-manual/talkyard-servers.conf will serve as a pass-through proxy only? I think I would need to change these settings; maybe some are in the image at /etc/nginx/*.conf?

                  server {
                    listen 80      backlog=8192;   # about backlog: see above [BACKLGSZ]
                    # Using ipv6 here, can prevent Nginx from starting, if the host OS has disabled ipv6,
                    # Nginx then won't start and says:
                    #    [emerg] socket() [::]:80 failed (97: Address family not supported by protocol)
                    #listen [::]:80 backlog=8192;
                  
                    server_name _;
                  
                    ## To redirect to HTTPS, comment out these includes, and comment in "return 302 ..." below.
                    include /etc/nginx/server-limits.conf;
                    include /etc/nginx/server-locations.conf;
                  
                  1. KajMagnus @KajMagnus 2018-11-09 15:32:12.624Z

                    Nginx does some different things: Rate & bandwidth limiting. Caching and serving uploaded files & assets. Websocket / long polling.

                    And, optionally, it terminates SSL/TLS. I think I'd like to move the TLS things to Traefik (in a Docker container), mainly because Traefik has automatic HTTPS. People who already have their own reverse proxy could then comment out Talkyard's Traefik container in Talkyard's docker-compose/stack.yml file.

                    I would expect removing Nginx to result in weird things. ... Hmm, what are the errors you encounter, with Nginx?

                    Which docker-compose.yml file did you base your installation on? The one in https://github.com/debiki/talkyard is intended for development and just for building prod images. Maybe I should document this better. There's another Compose-file intended for production, here: https://github.com/debiki/talkyard-prod-one/blob/master/docker-compose.yml.

                    (And I have in mind to add a docker-stack.yml file too, for people who want to use Swarm and maybe build their own images)
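
                    A rough sketch of how such a stack file might wire Talkyard's Nginx container behind an existing Traefik, following the Nextcloud example earlier in this thread. Everything specific here is an assumption: the debiki/ image names, the TALKYARD_VERSION_TAG variable (which would have to be set in the shell before deploying), the host name forum.example.com, and the network names. The database, cache, and search services are left out; the production Compose file linked above has the real service list.

                    ```yaml
                    version: "3.0"

                    services:
                      web:
                        # Assumed image name; Nginx inside listens on port 80 (see the server block above).
                        image: debiki/talkyard-web:${TALKYARD_VERSION_TAG}
                        networks:
                          - internal
                          - traefik_public
                        deploy:
                          labels:
                            - traefik.frontend.rule=Host:forum.example.com
                            - traefik.docker.network=traefik_public
                            - traefik.port=80

                      app:
                        # Assumed image name; only reachable over the internal network.
                        image: debiki/talkyard-app:${TALKYARD_VERSION_TAG}
                        networks:
                          - internal

                    networks:
                      traefik_public:
                        external: true
                      internal:
                        driver: overlay
                    ```

                    Traefik keeps handling HTTPS, and Nginx stays in place for rate limiting, caching, and WebSocket handling.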

    2. Progress
    3. @KajMagnus marked this topic as Planned 2018-11-05 09:42:25.776Z.