
Feature request: Search API

By Christian Scheuer @chrscheuer 2020-04-06 10:40:38.800Z

I don't remember whether we've discussed this elsewhere.
As far as I remember, we talked about it quite a bit, but that was a very long time ago.

We're doing various integrations now, where we try to make it easier for users to find the forum and to use it more.
A great way to get them to do that would be to integrate forum search into our app and website.

The API should ideally be accessible publicly without authentication, but we can also live with it if it has to go through our server first (at least for a start - even though it will make things a bit slower for the user).

We could live with the API just returning the top 5 or top 10 results as a starting point.
It would be ideal if it returns:

  • Title
  • Short summary (possibly with emphasized keywords)
  • Link

There are 13 replies. Estimated reading time: 13 minutes

  1. KajMagnus @KajMagnus 2020-04-20 03:43:29.932Z 2020-04-20 03:55:10.522Z

    Good idea, what about this public API in the upcoming version:

    GET  http://ty-server/-/v0/search?q=UX+improvements
    

    and the response: (note: matching phrases are marked with the HTML <mark> tag, in htmlWithMarks: ... below)

    {
      "searchResults" : [ {
        "pageTitle" : "support-chat",
        "pageUrl" : "http://site-3.localhost/-31",
        "postHits" : [ {
          "isPageTitle" : false,
          "isPageBody" : false,
          "htmlWithMarks" : [ "Probably such an iframe could be a bit better looking and <mark>UX</mark> friendly (maybe clickable author names)" ]
        } ]
      }, {
        "pageTitle" : "Potential UX improvements",
        "pageUrl" : "http://site-3.localhost/-334",
        "postHits" : [ {
          "isPageTitle" : false,
          "isPageBody" : true,
          "htmlWithMarks" : [ "All of this is about trying to <mark>improve</mark> the forum so it doesn&#x27;t require so much interaction from us on", "Right now it isn&#x27;t an <mark>improvement</mark> with the draft UI interleaved.", "Generally, <mark>UX</mark> changes that are only half-done means we have to spend time with reporting feedback when", "helpful if there could be a more well-tested&#x2F;documented approach for TY to introduce changes to the <mark>UX</mark>" ]
        } ]
      ...
    

    The response, as TypeScript interfaces:

    interface SearchResultsApiResponse {
      searchResults: PageAndHits[];
    }
    
    interface PageAndHits {
      pageTitle: string;
      pageUrl: string;
      postHits: PostHit[];
    }
    
    interface PostHit {
      isPageTitle?: boolean;
      isPageBody?: boolean;
      htmlWithMarks: string[];
    }
    

    isPageTitle can be good to know, because maybe you don't want to both show the title and include a highlighted matching phrase from the title (because then the title text gets included twice).

    (What's a good thing to call the Original Post? Above, it's "Page body": isPageBody?: boolean. But maybe people confuse that with the <body> html tag? Maybe isOrigPost would be better? But what if it's not a forum post, but an article? What about isArticleText? But what if it's not an article, but a forum post? Hmm)

    If !isPageTitle && !isPageBody, then the post is a reply (to the orig post, or to someone else).
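As a rough illustration of the rule above, here is a minimal sketch of how a client might tell title, body and reply hits apart, and leave out title snippets so the title isn't shown twice. The helper names hitKind and snippetsToShow are made up for this example and are not part of the API.

```typescript
// Same shapes as the interfaces above, repeated so the sketch is self-contained.
interface PostHit {
  isPageTitle?: boolean;
  isPageBody?: boolean;
  htmlWithMarks: string[];
}

interface PageAndHits {
  pageTitle: string;
  pageUrl: string;
  postHits: PostHit[];
}

type HitKind = 'title' | 'body' | 'reply';

// Hypothetical helper: if neither flag is set, the hit is in a reply.
function hitKind(hit: PostHit): HitKind {
  if (hit.isPageTitle) return 'title';
  if (hit.isPageBody) return 'body';
  return 'reply';
}

// Hypothetical helper: collects the highlighted snippets for one page,
// skipping title hits, since the caller shows pageTitle separately.
function snippetsToShow(page: PageAndHits): string[] {
  const snippets: string[] = [];
  for (const hit of page.postHits) {
    if (hitKind(hit) === 'title') continue;
    for (const html of hit.htmlWithMarks) {
      snippets.push(html);
    }
  }
  return snippets;
}
```

The snippets contain HTML with <mark> tags, so a client would have to decide whether to render them as HTML or strip the tags.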

    Maybe some time later, there could be an isAcceptedSolution field too?

    1. Christian Scheuer @chrscheuer 2020-04-20 18:08:49.565Z

      Yay - looks great!

      I think we'd like to have the category path and the last modified date. By category path I mean for example Packages -> Soundminer (since we have subcategories). These paths should have some kind of ID with them as well.
      Would it make sense to have the username of the posts and/or pages that were hit? At least the author of the page I think would be good to have so we can show them with a little image.

      Wrt using GET and querystring, I'm thinking this would be the start of the API, but it would likely be something that we'd want to augment in the future.
      For example to add:

      • Search only in certain categories
      • Search only in certain tags
      • Potentially paging

      For these reasons, I feel like a POST with json could potentially be more flexible. I seriously hate URL serialization/deserialization haha, everybody always gets it wrong.

      We also need to think about whether it returns only public material (I think it should by default).

      1. KajMagnus @KajMagnus 2020-04-25 06:45:34.105Z

        "category path and the last modified date"

        Yes (and that'd be nice to include on Talkyard's own search results page too).

        "the username of the posts and/or pages that were hit? [...] author of the page"

        Yes

        "so we can show them with a little image"

        The person's avatar image?

        "POST with json could potentially be more flexible"

        I think so too. Internally, Talkyard has both a GET API, so that queries can be linked via a URL, and a POST API, for the reasons you mentioned. I've now changed the public API to POST. A basic version (unfortunately without the things mentioned above) will be included in the upcoming version.

        The API wants JSON that looks like: { searchQuery: { queryText: "..... " }, pretty?: bool }. If the queryText is like: " ... text text categories:category-url-slug,another-cat-slug" then only those categories will get searched.

        We can add a separate categoryRefs: ... field next to queryText later, and then you can refer to the categories via ext-id instead, so the search functionality won't break if you change their URL slugs.
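To make the request shape above concrete, here is a small sketch that builds the JSON body, with the category filter appended to queryText as described. The function name buildSearchBody and its parameters are assumptions for illustration only. The field names come from the description above.

```typescript
// Request body shape as described above: { searchQuery: { queryText }, pretty? }.
interface SearchRequestBody {
  searchQuery: { queryText: string };
  pretty?: boolean;
}

// Hypothetical helper: appends "categories:slug-a,slug-b" to the free text
// when category URL slugs are given, so only those categories get searched.
function buildSearchBody(
  text: string,
  categorySlugs: string[] = [],
  pretty?: boolean,
): SearchRequestBody {
  const trimmed = text.trim();
  const queryText = categorySlugs.length > 0
    ? `${trimmed} categories:${categorySlugs.join(',')}`
    : trimmed;
  const body: SearchRequestBody = { searchQuery: { queryText } };
  if (pretty !== undefined) {
    body.pretty = pretty;
  }
  return body;
}

// Example: search for "UX improvements" in two categories (slugs made up here).
const exampleBody = buildSearchBody('UX improvements', ['ideas', 'support']);
console.log(JSON.stringify(exampleBody));
```

Because the slugs live inside the query string, renaming a category's URL slug would break such a query, which is the reason for the separate categoryRefs field mentioned above.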

        1. KajMagnus @KajMagnus 2020-04-26 22:14:16.189Z

          @chrscheuer — I'm adding author names and avatar url, + category name and URL (not the complete category path yet though).

          Someone mentioned an API endpoint for listing popular pages in a category (here),
          and I thought it'd be nice to implement both the search API and that other list-things API,
          and see how they can share a bit of code and TypeScript interfaces, with author names etc. included.

          1. Christian Scheuer @chrscheuer 2020-04-28 14:07:58.195Z

            Super cool. Let me know when it's up on either server so I can make some tests :)

            With regards to the tagging system, also let me know when/if you'd like to discuss it further. I think we may start implementing our own tagging system for now so we can get something up and running very quickly and then we can switch to the forum's system once it's ready.

            1. KajMagnus @KajMagnus 2020-04-30 08:07:28.750Z

              "tagging system, also let me know when/if you'd like to discuss it further. I think we may start implementing our own tagging system for now so we can get something up and running very quickly and then we can switch to the forum's system once it's ready"

              I don't think I'll have time to look into the tagging system in the coming weeks. I should probably do OpenID Connect first. Also, maybe in a way it'd be good if you build your own tags? Then you can tell me how to implement tags in Talkyard in a way that works for you (and you seem to have a slightly more advanced need for tags than most organizations (?), so what works fine for you would work fine for almost everyone, I'd think).

              B.t.w. one thing: I think I'd like the unique identifier of a tag to be a numeric ID, not the tag label, so one can rename a tag without having to re-index all pages tagged with that tag. (In Elasticsearch, the page would be connected to that never-changing numeric tag ID, so there's no need to reindex the pages when renaming a tag label, since the ID didn't change.)
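A tiny in-memory sketch of that idea, assuming nothing about how Talkyard or Elasticsearch actually store tags: pages reference tags by numeric ID only, and labels live in a separate lookup, so a rename touches one entry and no page data. All names here (TagStore, renameTag, labelsForPage) are hypothetical.

```typescript
// Hypothetical store: labels are looked up by numeric tag ID,
// and pages only ever reference the IDs.
interface TagStore {
  labelsById: Record<number, string>;
  tagIdsByPage: Record<string, number[]>;
}

// Renaming changes the label lookup only; tagIdsByPage is left untouched,
// which is why no page would need to be re-indexed.
function renameTag(store: TagStore, tagId: number, newLabel: string): void {
  store.labelsById[tagId] = newLabel;
}

// Resolves a page's tag IDs to their current labels at read time.
function labelsForPage(store: TagStore, pageId: string): string[] {
  const ids = store.tagIdsByPage[pageId] || [];
  return ids.map(id => store.labelsById[id]);
}

// Example with made-up data.
const store: TagStore = {
  labelsById: { 7: 'soundminer' },
  tagIdsByPage: { 'page-334': [7] },
};
renameTag(store, 7, 'soundminer-v5');
console.log(labelsForPage(store, 'page-334'));
```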

              1. Christian Scheuer @chrscheuer 2020-04-30 10:40:00.465Z

                Completely agree. That's also why I just thought we could start on our own - it will be easier to show you what we want by having something that already works :)

              2. In reply to chrscheuer:
                KajMagnus @KajMagnus 2020-04-30 07:09:59.122Z 2020-04-30 07:19:28.020Z

                I've upgraded this server, Ty .io. Your server, Ty .net, will follow in 2 days I'd think (that is, Saturday).

                Meanwhile — here's the modified Search API:

                https://github.com/debiki/talkyard/blob/40ff70deb434d16f5d833ae8005158f873671637/tests/e2e/pub-api.ts#L292

                (The changes: the search query field is renamed from queryText to freetext. The search results are now in a thingsFound array instead of searchResults, and postHits is now postsFound. "Found" sounds nicer than "Hit" I think; some time later, when searching for people, it would be ParticipantFound[] instead of ParticipantHit[].)

                (If you scroll up and look at type FindWhat = 'Pages' | 'Members' | ... and interface LookWhere { ..., then ignore the comment about the ReferencedThings object; I forgot to delete that comment.)

                B.t.w. the only thing I've actually implemented so far is:

                POST /-/v0/search  {
                  searchQuery: { freetext: "... search query ..." }
                }
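For anyone wanting to try the endpoint above from an app, here is a hedged sketch that prepares the POST request without sending it. The helper name prepareSearchRequest, the PreparedRequest shape, the example origin and the Content-Type header are assumptions. Only the path and the searchQuery.freetext field come from the snippet above.

```typescript
// Body shape from the snippet above.
interface SearchQueryBody {
  searchQuery: { freetext: string };
}

// Hypothetical container for everything an HTTP client would need.
interface PreparedRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Builds the request for POST /-/v0/search on the given server origin.
// Sending a JSON Content-Type header is an assumption, not confirmed above.
function prepareSearchRequest(origin: string, freetext: string): PreparedRequest {
  const body: SearchQueryBody = { searchQuery: { freetext } };
  return {
    url: origin.replace(/\/+$/, '') + '/-/v0/search', // drop trailing slashes
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  };
}

// Example with a made-up origin.
const prepared = prepareSearchRequest('https://forum.example.com/', 'UX improvements');
console.log(prepared.url, prepared.body);
```

The result could then be handed to an HTTP client, for example fetch(prepared.url, { method: prepared.method, headers: prepared.headers, body: prepared.body }).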
                

                ( + a list query, for listing the most popular pages, in a specific category:

                /-/v0/list  {
                  listQuery: {
                    findWhat: 'Pages',
                    lookWhere: { inCategories: ['extid:the_categorys_ext_id'] },
                  }
                }
                

                )
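The list query can be put together the same way. This sketch only covers the one combination shown above (findWhat: 'Pages' with inCategories given as ext ids). The helper name listPagesInCategories is made up, and the fuller FindWhat and LookWhere types live in the linked file.

```typescript
// Only the fields used in the example above; the linked file defines more.
interface ListQueryBody {
  listQuery: {
    findWhat: 'Pages';
    lookWhere: { inCategories: string[] };
  };
}

// Hypothetical helper: turns plain category ext ids into 'extid:...' refs
// and wraps them in a list query for pages.
function listPagesInCategories(categoryExtIds: string[]): ListQueryBody {
  return {
    listQuery: {
      findWhat: 'Pages',
      lookWhere: {
        inCategories: categoryExtIds.map(id => `extid:${id}`),
      },
    },
  };
}

// Example, reusing the placeholder ext id from the snippet above.
console.log(JSON.stringify(listPagesInCategories(['the_categorys_ext_id'])));
```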

                1. Christian Scheuer @chrscheuer 2020-04-30 10:38:42.543Z

                  This all looks brilliant - great with your ElasticSearch guides on compound queries as well!
                  Love the scrollCursor placeholder too.

                  1. Christian Scheuer @chrscheuer 2020-04-30 10:39:08.908Z

                    Does lookWhere.writtenBy accept ssoid user IDs?

                    1. KajMagnus @KajMagnus 2020-05-02 18:08:00.815Z 2020-05-02 18:15:34.228Z

                      "accept ssoid user IDs?"

                      Not yet, but yes, that's the idea: writtenBy: ['ssoid:...', 'username:...', 'username:could_be_a_group' ].

                      Sorry, it seems I won't upgrade the server until tomorrow.

      2. Progress
        with doing this idea
      3. @KajMagnus marked this topic as Started 2020-04-20 03:49:59.044Z.