Making backups of large forums crashes Talkyard
When we request /-/export-site-json on our forum, after several minutes of waiting we receive this:
502 Bad Gateway [TyE502BGW]
Talkyard's Nginx server cannot connect to Talkyard's application server. Is the application server not running, or is there some network error?
Please check if the 'app' Docker container is running: 'docker-compose ps'
If it's not running:
- Start it: 'docker-compose start app'. Then wait a few seconds and reload this page.
- Or login in a Bash shell: 'docker-compose run --rm --service-ports app bash'
If it is running:
- Check the logs: 'docker-compose logs'
- Only Play's logs: 'docker-compose logs app'
- Or jump into the container: 'docker-compose exec app bash'
It looks like the application server crashes, possibly due to being out of memory while building the json object for export.
Our forum is getting to a size where having periodic backups available would be really nice, but this endpoint has been crashing or not working for several months.
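If the crash really is caused by building the whole JSON object in memory, one common way around that is to write the export piece by piece. The sketch below only illustrates that idea in Python; it is not Talkyard's implementation (Talkyard's app server is a Play application, not Python), and `fetch_posts` and the `{"posts": [...]}` layout are hypothetical stand-ins.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def fetch_posts() -> Iterator[dict]:
    """Hypothetical stand-in for reading posts from the database lazily."""
    for post_id in range(1, 2501):
        yield {"id": post_id, "text": f"post {post_id}"}


def write_export(posts: Iterable[dict], out: TextIO) -> int:
    """Write {"posts": [...]} to `out` one post at a time; return the count.

    Only one post is serialized at any moment, so memory use stays roughly
    constant no matter how many posts the forum has.
    """
    count = 0
    out.write('{"posts": [')
    for post in posts:
        if count:
            out.write(",")  # separator before every post except the first
        out.write(json.dumps(post))
        count += 1
    out.write("]}")
    return count


if __name__ == "__main__":
    buffer = io.StringIO()  # in real use: a file or the HTTP response stream
    written = write_export(fetch_posts(), buffer)
    print(f"wrote {written} posts")
```

In real use `out` would be a file or the response stream rather than an in-memory buffer; the `StringIO` here is only for demonstration.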
- 7 replies
- Christian Scheuer @chrscheuer
@KajMagnus did you see this? Now that we're getting webhooks and more API endpoints, I don't think we would need to use this as an API endpoint anymore, so the importance for us here is not to be able to dynamically fetch json backups, but rather that backups are produced in the first place, and available to download to our own servers.
We 99.9999% surely won't be needing those backups, but given the reliance our company is having on the forum content, it's becoming vital for us to be able to have an offline (or self-hosted) backup system of the forum. This simply has to do with business continuity - if anything should ever happen to TY or to you, we'd risk losing all of the hosted content - so we need some kind of backup solution where we'd have access to the content.
If making the json dynamically is hard to get working without crashing the app server (also at, say 5x the size of our current forum) - would it be easier if Talkyard produced daily backups as a cron job and uploaded them to a storage server? Either one that we host on GCP or hosted on TY's own AWS servers? Ideally for us we would be able to give you a bucket on GCP to upload to.
I didn't see this until now — I'll have a look. More RAM sounds like a good first thing to try — I'd think that'll solve the problem. I'll have a look later today.
I agree with everything you wrote.
> would it be easier if Talkyard produced daily backups as a cron job and uploaded them to a storage server?
Seems like a good idea, yes. Currently such backups include all sites hosted by the same server — and doing small such backups, of each large individual site, one at a time, and shipping off-site to that organization's IT system, seems like a good idea.
> Either one that we host on GCP
That'd be best, on GCP or wherever else. Maybe you could fetch the files via rsync or something (then not bound to GCP, and also easier to make it ransomware resistant, in that you could tell rsync not to delete old files?).
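For illustration, the "never delete old files" idea can be sketched in plain Python. rsync itself does not delete destination files unless `--delete` is passed, and its `--ignore-existing` option skips files already present on the receiving side; the function below mimics that behavior for local directories. The name `additive_sync` and the directory layout are assumptions made up for this example, not something Talkyard provides.

```python
import shutil
from pathlib import Path


def additive_sync(src: Path, dest: Path) -> list[Path]:
    """Copy files from `src` that are missing in `dest`.

    Nothing in `dest` is ever deleted or overwritten, so a backup that was
    fetched earlier survives even if the source copy is later removed or
    tampered with. Returns the relative paths that were copied.
    """
    copied = []
    for src_file in sorted(src.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(src)
        target = dest / rel
        if target.exists():
            continue  # keep the existing copy untouched
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, target)  # copy2 also preserves timestamps
        copied.append(rel)
    return copied
```

Skipping existing files means a corrupted or encrypted file on the source cannot replace a good copy on the backup side, at the cost of never picking up legitimate changes to a file with the same name. Date-stamped backup file names avoid that conflict.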
> give you a bucket on GCP to upload to
Hmm, then maybe Ty could mount the bucket as a file system dir, so it'd be writeable by a plain cron job. Could look into the GCP docs about mounting a bucket from another GCP project.
Awesome! Yea a RAM upgrade could be a good thing to try as a first step just to get something working here and now. Thanks for considering this.
We're fine with scheduling our own jobs to download from your CDN/storage and put on our own GCP once per day. That would be very easy for us to set up, if we know where to download from of course. Whatever is the easiest for you to develop first, I would vote for :)
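A daily download job of that kind could look roughly like the sketch below. The backup URL does not exist yet, so `BACKUP_URL` is purely a placeholder, and the `forum-backup-YYYY-MM-DD.json` naming scheme is likewise an assumption for the example.

```python
import datetime
import shutil
import urllib.request
from pathlib import Path
from typing import Optional

# Placeholder only: the real download location has not been decided.
BACKUP_URL = "https://backups.example.com/forum-export.json"


def download_backup(url: str, dest_dir: Path,
                    today: Optional[datetime.date] = None) -> Path:
    """Stream `url` into a date-stamped file in `dest_dir`; return its path."""
    today = today or datetime.date.today()
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / f"forum-backup-{today.isoformat()}.json"
    partial = target.with_suffix(".part")
    # Copy in chunks so a large export never has to fit in memory.
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    # Rename only after the download finished, so a half-written file
    # is never mistaken for a complete backup.
    partial.replace(target)
    return target


if __name__ == "__main__":
    saved = download_backup(BACKUP_URL, Path("backups"))
    print(f"saved {saved}")
```

Run once per day from cron or any other scheduler, this keeps one file per date, so older backups are never overwritten by newer ones.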
Ok :- )
Memory just upgraded, now twice as much, should last for ... quite a while
Let me know if the backups start working? & I'll check the logs too (there were out of memory errors in the logs, caused by GET //forum.soundflow.org/-/export-site-json and me having configured too little memory)
- Progress with handling this problem
- @KajMagnus closed this topic 2022-04-18 02:16:29.474Z.
- @KajMagnus reopened this topic 2022-04-18 02:16:44.884Z.
- @KajMagnus marked this topic as Done 2022-04-18 02:16:47.416Z.