Bleve - Alternative to Elastic?

By @stellarpower 2019-08-04 04:23:29.678Z

Hi,

I was reading through some of the docs and was already aware of the runtime requirements of the JVM. I saw that Toshi was mentioned as an alternative to Elasticsearch, which uses a lot of resources, but I see it's not regarded as production-ready yet.

I did some googling and found something called Bleve - http://blevesearch.com/. The site seems a bit out of date, but there are comments and commits from the last few days, and as I know nothing about Elastic I was wondering if it was worth considering as an alternative. From what I see here
https://news.ycombinator.com/item?id=8279081
it seems to be aimed at replacing Lucene at the low level, but offers a sufficiently high-level API that it could feasibly be used instead of Elastic.

Think this is of any use?

Cheers :)

  • 4 replies
  1. KajMagnus @KajMagnus 2019-08-05 14:19:34.767Z

    Hi @stellarpower, Bleve does look like an interesting alternative to ElasticSearch, one which (I suppose) would use only a fraction of the memory and CPU that ES requires. At the same time, it's a library that (it seems to me) one calls from one's Go code, rather than a standalone search engine server that one calls via HTTP. And what I want for Talkyard is a separate server (application process) that handles search, rather than a library. One reason is that otherwise it'd be a bit too simple to DoS attack Talkyard, by crafting "evil" search queries that use up all CPU and memory of the main Talkyard server itself (making Ty inaccessible to "everyone").

    Toshi, however, is similar to ElasticSearch in that it runs in its own separate process, which I can place in its own Docker container with memory and CPU restrictions. So if something goes wrong, it'll affect the search module only, not the whole of Talkyard. (Also, it's written in Rust :- ) I like Rust)
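
    A rough sketch of that isolation from the calling side, in Go (not Talkyard's actual code): the main server talks to the search process over HTTP and gives up after a deadline, so a stuck search costs one failed request rather than the whole site. The `stubTransport` type, the `searchTimedOut` and `demo` helpers, and the `http://search.internal:7000` address are all made up for illustration; the stub stands in for a search container that never answers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// stubTransport is a hypothetical stand-in for a stuck search container:
// it never answers and only returns once the request's context has ended.
type stubTransport struct{}

func (stubTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	<-r.Context().Done()
	return nil, r.Context().Err()
}

// searchTimedOut sends one query to the search service at baseURL and
// reports whether the call was abandoned because the deadline passed.
func searchTimedOut(c *http.Client, baseURL, query string, limit time.Duration) bool {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		baseURL+"/search?q="+url.QueryEscape(query), nil)
	if err != nil {
		return false
	}
	resp, err := c.Do(req)
	if err != nil {
		return errors.Is(err, context.DeadlineExceeded)
	}
	resp.Body.Close()
	return false
}

// demo wires the stub in place of a real network connection.
// The address is an assumption and is never actually dialled.
func demo() bool {
	client := &http.Client{Transport: stubTransport{}}
	return searchTimedOut(client, "http://search.internal:7000", "evil query", 50*time.Millisecond)
}

func main() {
	// The caller gives up after 50 ms instead of hanging with the search process.
	fmt.Println("search abandoned after deadline:", demo())
}
```

    The deadline only protects the caller. Capping what the search process itself may consume is still the job of the container's memory and CPU limits.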

    Another possibly good alternative could be PostgreSQL's built-in search, b.t.w. (which is getting better and better each year :- )). Not sure how "easy" it would be to DoS the PostgreSQL database by typing "evil" queries. Hmm.

    1. @stellarpower 2019-08-05 14:40:58.321Z

      Hey Kaj,

      Absolutely understand, that's a very sensible idea. I had a look at Toshi and it seemed good; I was just concerned about the open admission in the readme that it's not production-quality yet. I think the lower-level library is stable enough, but Toshi itself apparently isn't yet. If I could put Bleve into a container as a server and mould the API sufficiently so that it (let's just assume hypothetically for a second) could be dropped in without any modification, would this be of interest at all?
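
      To make the "library in a container behind an API" idea concrete, here is a minimal Go sketch. Everything in it is an assumption for illustration: the `Searcher` interface is a made-up seam and not part of Bleve's API, `memSearcher` is a toy in-memory stand-in for a real index, and the `/search?q=` route and JSON response do not match ElasticSearch's API or anything Talkyard expects. It only shows the shape: cap the query length, bound the time spent per query, and answer over HTTP.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"sort"
	"strings"
	"time"
)

// Searcher is a hypothetical seam: a Bleve-backed index, or any other
// engine, would sit behind it. It is not part of Bleve's API.
type Searcher interface {
	Search(ctx context.Context, query string) ([]string, error)
}

// memSearcher is a toy stand-in doing case-insensitive substring matching.
// It ignores ctx; a real engine should stop work when ctx is cancelled.
type memSearcher map[string]string // document ID -> text

func (m memSearcher) Search(ctx context.Context, query string) ([]string, error) {
	ids := []string{}
	for id, text := range m {
		if strings.Contains(strings.ToLower(text), strings.ToLower(query)) {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids, nil
}

// newSearchHandler exposes s over HTTP, rejecting empty or over-long
// queries and putting a deadline on each search.
func newSearchHandler(s Searcher, maxLen int, limit time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		q := r.URL.Query().Get("q")
		if q == "" || len(q) > maxLen {
			http.Error(w, "bad query", http.StatusBadRequest)
			return
		}
		ctx, cancel := context.WithTimeout(r.Context(), limit)
		defer cancel()
		ids, err := s.Search(ctx, q)
		if err != nil {
			http.Error(w, "search failed", http.StatusServiceUnavailable)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(ids)
	})
}

// query runs one in-process request against h and returns status and body.
func query(h http.Handler, q string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/search?q="+url.QueryEscape(q), nil)
	h.ServeHTTP(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

// demoHandler builds a handler over two sample documents, with an assumed
// 20-byte query cap and a one-second limit per search.
func demoHandler() http.Handler {
	docs := memSearcher{"1": "Rust is fun", "2": "Go is simple"}
	return newSearchHandler(docs, 20, time.Second)
}

func main() {
	fmt.Println(query(demoHandler(), "rust"))
	// Inside a container one would instead serve it for real, e.g.:
	// http.ListenAndServe(":8080", demoHandler())
}
```

      Keeping the engine behind an interface like this is also what would let it be swapped later (Bleve, Toshi, PostgreSQL search) without touching the HTTP layer.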

      I'm interested in Talkyard for my site, and just generally: it's a fantastic piece of software that does nearly everything I could want, and you've clearly designed and documented it exceptionally well for a FOSS product. But I'm on a budget and have been looking recently into serverless options where there is a generous free tier but resource usage is limited, so I'd be interested in whether the footprint can be reduced, and further, whether e.g. Postgres (and Redis?) could be separated and hosted by a cloud database server to take advantage of the free tier. I know you offer served options but was a little confused by the pricing (but I'll keep this thread on-topic and ask later).

      If there are any other things that I could help with or that could do with being completed, please let me know and I'll try :) I love how Talkyard is just packaged as a Docker image and I can just run with it. Makes life much easier. Unfortunately, however, I think the downside to this is that if I can't provide a setup in Bionic that meets the requirements, it can't be run so easily. So I'd be interested, time permitting, in looking into whether it can be a bit more modular. Your design already is, but I mean whether one could allow components to be swapped out for alternatives where the default setup is impossible.

      Cheers

      Ben

      1. KajMagnus @KajMagnus 2019-08-06 14:15:43.823Z

        Ok yes I too think Toshi seems like too high a risk, currently.

        If I could put Bleve into a container as a server and mould the API sufficiently so that it (let's just assume hypothetically for a second) could be dropped in without any modification, would this be of interest at all?

        Hmm. For the near future (like, 10 weeks) I'm afraid I'm too short of time to have a look at the result. After that, ... Probably I'd want to wait until Toshi is more stable, and then compare Toshi with this Bleve + API server you might create, with built-in PostgreSQL search, and with ElasticSearch.

        Another thing you could maybe do, is to try to integrate Toshi with Talkyard, and join the Toshi project and make it production quality sooner? Maybe by contributing automatic tests, to the Toshi project? — b.t.w. what's your overall project time frame?

        (I'm getting the impression you're familiar with Golang? Not as much with Rust? What's your technical background, if I may ask :- ))

        Actually, looking at the activity in Toshi: https://github.com/toshi-search/Toshi/graphs/contributors
        And Tantivy (comparable to Bleve, written in Rust): https://github.com/tantivy-search/tantivy/graphs/contributors
        And Bleve: https://github.com/blevesearch/bleve/graphs/contributors

        ... then I'm actually thinking that Tantivy and Toshi are more "alive" than Bleve. Bleve's original author isn't contributing that much to Bleve any longer, whereas the creators of Tantivy and Toshi are both still active in their projects.

        have been looking recently into serverless options where there is a generous free tier but resource usage is limited, so I'd be interested in whether the footprint can be reduced, and further, whether e.g. Postgres (and Redis?) could be separated and hosted by a cloud database server to take advantage of the free tier.

        I'd love to hear about your use case. What do you have in mind to use Talkyard for?
        And what is affordable, from your perspective and for your use case?
        Is it ok if I ask which country you live in?

        footprint can be reduced and further

        I want Talkyard to run on a Raspberry Pi with 600 MB RAM :- ) not in the coming months, but ... eventually.

        I know you offer served options but was a little confused by the pricing (but I'll keep this thread on-topic and ask later)

        Sorry about that. Please ask and I'll try to clarify?
        Maybe I can change the pricing pages too so they become less confusing.
        (If you self-host the open source version, a 2 GB DigitalOcean VPS at $10 / month should work fine.)

    2. Progress
    3. @KajMagnus closed this topic 2019-08-05 14:19:45.392Z.