No internet connection
  1. Home
  2. Ideas

Talkyard API: Upserting categories

By KajMagnus @KajMagnus2019-06-14 10:38:22.764Z2019-06-14 10:48:35.808Z

So there's going to be an endpoint for upserting categories. This is what I have in mind: (feedback welcome :- ))

POST server-address/-/v0/upsert-simple

and that endpoint accepts a SimpleUpsertV0 object that looks like this, in Typescript: (and ? means optional )

interface SimpleUpsertV0 {
  categories?: SimpleCategoryV0[];

  // ... more things, later, if you want to upsert many things
  // in the same database transaction.
}


interface SimpleCategoryV0 {
  id: CategoryId;      // should be  2 000 000 001
  extImpId?: string;   // your own external unique id for the category
  parentId?: CategoryId;         // later, skip for now
  defaultSubCatId?: CategoryId;  // later, skip for now
  name: string;
  slug: string;
  description: string;
  position?: number;
  defaultTopicType?: PageType;
};

The extImpId is your own unique id for the category. When Talkyard (Ty) upserts the category, Ty first tries to look up any existing category with this "external import id", and, if found, Ty updates that category. If not found, Ty creates a new category, assigns it an id (less than 2e9) and remembers its extImpId.

The id field should be any number > 2e9, e.g. 2 000 000 001. Numbers > 2e9 are reserved for things you upsert into Ty and that you don't know if they're present in Ty's database (and you don't want to write extra code to find out).

You can set id to any 32 bit signed integer > 2e9. If you upsert more things, like pages and sub categories, in a single HTTP request — then, some of the other items you upsert, can refer to the category.id and then Talkyard knows you want to place them in that category.

(B.t.w. I'm including the word "simple" in the URL and Typescript interface, because under the hood, more things happen, that you don't need to know about: When you create a category, then, also, a category description page gets created, and a Post that stores the category description text and edit revisions, if you edit the description later. — Probably there'll also be a "complicated" endpoint /-/v0/upsert-dump that specifies exactly what things should be upserted and doesn't do anything for you automatically / implicitly)

@chrscheuer what do you think? Something that can be done in a better way, or maybe re-thinking the whole approach?

  • 75 replies

There are 75 replies. Estimated reading time: 91 minutes

  1. KajMagnus @KajMagnus2019-07-01 09:23:19.323Z

    @chrscheuer

    I think you'll need to be able to assign an "External Import IDs" of your choosing, to the parent Packages category, so you have an id to refer to, when upserting child categories for individual packages.

    Another approach would be that you refer to the Packages category, via Talkyard's internal category id, but not impossible that exposing such internal ids will cause problems in the future, if I need to change the ids somehow, or if they get remapped to different numbers during an export and import.

    I'm thinking it'd be more future proof, if you instead give the Packages category an extImpId, say, "soundflow_packages", to refer to when upserting child categories.

    To upsert a sub category, you'd then do this:

    POST server-address/-/v0/upsert-simple
    
    {
      categories: [{
        // This is the Packages category. It's in the database already; this entry is here only so you'll
        // have a "temporary import id" to refer to ---.
        extImpId: 'soundflow_packages',                |
        id: 2000000001,                   <------------+
      } {                                              |
        extImpId: 'someones_new_package_name',         |
        id: 2000000002,                                |
        parentId: 2000000001,          <---------------`
        name: "New Package Name",
        slug: "new_package_slug',
        description ....
      }]
    }
    

    Talkyard will then lookup extImpId = 'soundflow_packages' in the database to find the real internal id X of the Packages category, and then, when saving the New Package, Talkyard sets its parentId not to 2000000001 but to X.

    Another alternative could be:

    POST server-address/-/v0/upsert-simple
    
    {
      categories: [{
        extImpId: 'someones_new_package_name',
        id: 2000000001,
        parentExtImpId: 'soundflow_packages,    <———
        name: "New Package Name",
        slug: "new_package_slug',
        description ....
      }]
    }
    

    What do you think?

    1. CChristian Scheuer @chrscheuer2019-07-01 09:26:12.295Z

      This sounds great!

      1. CChristian Scheuer @chrscheuer2019-07-01 09:28:15.826Z

        As a general scheme, I don't know if you need to distinguish between "pure-reference entities" and "entities for upserting". Right now you seem to make the distinction based on the name/slug/description parameters being present, but imagine that all those parameters were optional (for something other than categories)..
        Idk. It's a hypothetical and not a problem for now. Problem / not a problem?

        1. CChristian Scheuer @chrscheuer2019-07-01 09:30:35.943Z

          A different way would be to have id and parentId be strings, that could be of the form:

          parentId: "talkyard:53"
          parentId: "ext:soundflow_packages"
          

          By using a scheme like this instead of magic numbers it's more manageable for any future additions.

          1. CChristian Scheuer @chrscheuer2019-07-01 09:32:08.046Z

            This doesn't change anything about the fact that the underlying id's are numerical. It's just making a more concise interface. Note this also covers the fact where you need to upsert more things in the same batch and have them relate to each other, since you'd be using the "ext:xxx" form then.
            It could also support pure numerical input, in which case it would assume it was the talkyard internal id.

            1. CChristian Scheuer @chrscheuer2019-07-01 09:35:43.487Z

              Sorry for spamming. But the upside of having a unified simple string-based format for referencing entities is this can help in other ways too, for example in quick-access URLs it would be easier than having to present both entities.
              Also, the magic numbers system has the drawback that the user of the API needs to build a separate map for magic-numbers to ext-import-ids, that they don't have to with the string based format.
              It's inspired by the API auth token btw.

              1. KajMagnus @KajMagnus2019-07-11 06:06:12.468Z2019-07-11 06:28:24.064Z

                distinguish between "pure-reference entities" and "entities for upserting". Right now you seem to make the distinction based on the name/slug/description parameters being present, but imagine that all those parameters were optional (for something other than categories)

                That's a good point. For now, maybe we can do like this:

                {  // a category to upsert
                   extId: 'someones_new_package_name',
                   parentExtId: 'soundflow_packages,    <———
                   name: "New Package Name",
                   slug: "new_package_slug',
                   description ....
                }
                

                and then we won't need to think about that possible problem, for now.

                A different way would be to have id and parentId be strings

                Hmm. Sometimes both an internal Talkyard id, and an external id, will get imported at the same time. So I think there needs to be two separate fields, for the internal and external ids? This would happen if you export your Talkyard site from talkyard.net to a dump file, and then import to your own server. Then you'd want to reuse the same page and post internal ids (page ids typically appear in the urls) — and also import the same external ids. So, two fields, for each item that gets imported. And one would be the numeric internal id, and the other would be an external id string (no "ext:" prefix needed).

                unified simple string-based format for referencing entities is this can help in other ways too, for example in quick-access URLs

                That sounds as if could be nice. You have in mind that it'd get used e.g. like this?:

                /-/v0/list-topics-in-category?categoryId=ext:external_category_id
                /-/v0/list-topics-by-user?userId=extSsoId:user_external_single_sign_on_id
                /-/v0/list-topics-by-user?userId=extId:user_external_id
                

                Or this would work equally well? and might be slightly simpler to implement, hmm:

                /-/v0/list-topics-in-category?categoryExtId=external_cat_id
                /-/v0/list-topics-by-user?userExtSsoId=user_external_single_sign_on_id
                /-/v0/list-topics-by-user?userExtId=user_external_id
                

                the magic numbers system has the drawback that the user of the API needs to build a separate map for magic-numbers to ext-import-ids, that they don't have to with the string based format

                That's a good point. I just did that, for a Disqus comments importer, and indeed this was a bit extra work. It'd be nice to re-think this later and make the dump files work with purely external ids. — For now, the quickest and simplest way to get something working, was actually to use these magic 2e9 + 1 numbers. This is partly related to me using Scala server side, statically typed, and creating new classes with string fields, or changing the ids from int to string, seemed like lots of work.

                ... Some time later I think I'll need to create "Patch" classes that are effectively copies of [the other classes representing things in the database], but with all fields optional. And with string extId:s and parentExtIds etc. Then, by specifying only some of these optional fields, and upserting something, one can choose exactly what to change, in the database. E.g. upserting a CategoryPatch, and specifying an extid and a new category slug, but no other fields —> the slug changes, but description and name and everything else left intact.

                1. CChristian Scheuer @chrscheuer2019-07-11 08:13:55.270Z

                  Yea this is good progress on the issue. I'm a bit tired when reading this so not entirely sure I get everything 100%. Will re-read later. First thoughts are:

                  That's a good point. For now, maybe we can do like this:

                  { // a category to upsert
                  extId: 'someones_new_package_name',
                  parentExtId: 'soundflow_packages, <———
                  name: "New Package Name",
                  slug: "new_package_slug',
                  description ....
                  }

                  Yes this is an improvement to not have to deal with the extra pure-reference entity and magic number mappings.

                  However the "parentExtId" field doesn't solve the case where you want to reference a parent that has an existing internal talkyard ID. This would only work for referencing parents that were associated with external IDs. Maybe you want to enforce that entities that need mapping always need external IDs? Or maybe I would just need to put a different field in there if I was trying to refer to it via the parent's TY id...

                  Hmm. Sometimes both an internal Talkyard id, and an external id, will get imported at the same time. So I think there needs to be two separate fields, for the internal and external ids?

                  I agree for primary keys yes. I was only referring to using the string based approach when referencing another entity. Not for any entity's own primary keys.
                  So an entity that you send would be for example

                  {
                      categoryId: 235238458,
                      parentId: 239582 | 'ty:239582' | 'ext:lakejrha:Aerlhkjaer:ekljfdlh',
                      ....
                  }
                  

                  My point being here, you don't want conflicting references to parents. You just want 1 reference. The parent itself will have both a TY ID and a EXT ID (potentially). The reference will be to either of those two.
                  If you allow both a parentExtId and a parentId field they could be in disagreement and that means we then need to figure out who is more in charge.

                  In the roundtrip case we'd have a pure number / TY-ID reference since that's what we received in the dump. If we want to get the parent's EXT ID in the dumped version we could look it up by resolving the parentId ourselves.

                  No matter what, the problem we're trying to solve is purely how you want to reference other entities within something that we send TO the API. I think the API should always return TY's own IDs..
                  Hmm.. Or, we think of the ID systems as 2 entirely separate systems, meaning you'd always return both a parentId and a parentExtId, just for making it easier for the API consumer... I'm just not sure if this makes it easier or if it introduces "who's in charge" problems as described above...
                  At this point maybe we're overthinking this a tiny bit. But it's an interesting discussion :)

                  That sounds as if could be nice. You have in mind that it'd get used e.g. like this?:
                  ...
                  Or this would work equally well? and might be slightly simpler to implement, hmm:
                  ...

                  Gotcha. Yea those would both work of course. I'll leave it up to you as to what makes best sense from a data perspective. Implementation wise it's just a string split so I wouldn't let that be the determining factor. I would look at if it makes sense to have a unified way across the whole system to do ID references of entities #1, or if you want to define external ID associations more manually (distinct property naming per entity) as in #2. If I were designing a schema that was to be automating syncing between 2 systems I would like this to fairly standardised so you don't end up with slightly differing field names etc (can grow ugly over time).

                  1. KajMagnus @KajMagnus2019-07-13 16:56:03.895Z

                    However the "parentExtId" field doesn't solve the case where you want to reference a parent that has an existing internal talkyard ID. This would only work for referencing parents that were associated with external IDs. Maybe you want to enforce that entities that need mapping always need external IDs? Or maybe I would just need to put a different field in there if I was trying to refer to it via the parent's TY id

                    I had in mind that one could choose between using 1) parentExtId: "<external_id>" or 2) parentId: <talkyard_internal_int_id>. And if one specified both, that'd work fine, as long as both ids pointed to the same parent — otherwise, the server would reply Error.

                    I was only referring to using the string based approach when referencing another entity. Not for any entity's own primary keys.

                    Ok

                    don't want conflicting references to parents. You just want 1 reference

                    I agree; just 1 ref is simpler

                    Now I'm thinking there can instead be a parentRef (note: Ref ) field, to reference the parent. It can be a string, e.g. "extid:soundflow_packages" or "tyid:1234". And in a url:

                    https://server/-/v0/list-categories?inParentCategory=extid:soundflow_packages
                    

                    (and, like you wrote above, if there's no type: prefix, then the value is interpreted as a Takyard internal id.)

                    I'm thinking using another name for these reference values, makes it easier to talk about. When one says "partenId" one always means Talkyard's internal ids. And "parentRef" or "whateverRef" menas these "extensible" text values with a type: prefix.

                    (And one typically wouldn't include parent ids, when upserting something. Only a ref to the parent. And 1) if one includes an id anyway, it needs to point to the same thing, as the parent ref does. And 2) when one exports a Talkyard site to a dump file, only the parentIds would be included in the exported json.)

                    1. CChristian Scheuer @chrscheuer2019-07-14 16:38:45.799Z

                      parentRef is a great idea, to distinguish clearly between ids and references.

                      Now I'm thinking there can instead be a parentRef (note: Ref ) field, to reference the parent. It can be a string, e.g. "extid:soundflow_packages" or "tyid:1234". And in a url:
                      https://server/-/v0/list-categories?inParentCategory=extid:soundflow_packages
                      (and, like you wrote above, if there's no type: prefix, then the value is interpreted as a Takyard internal id.)

                      Yes I completely agree.

                      1. CChristian Scheuer @chrscheuer2019-07-14 18:29:33.902Z

                        The distinction between refs and IDs also means there's a clear distinction between API input and output. Input would use refs for refs to parents, groups etc, whereas API output would be IDs (ie. pure talkyard integers) - for example if you're listing categories, posts etc.
                        This makes sense to me.

    2. In reply toKajMagnus:
      KajMagnus @KajMagnus2019-07-14 02:15:14.024Z

      @chrscheuer What about permissions? The upserted Soundflow package categories, are they to be publicly visible, or visible only to community members?

      Maybe the JSON can be like: (note the permissions: array at the end, and the ...Ref: 'extid:...' references)

      POST server-address/-/v0/upsert-simple
      
      {
      categories: [{
         extId: 'a_package_name',
         parentRef: 'extid:soundflow_packages',
         name: 'Package Name',
         slug: 'package_slug',
         description: '...',
         permissions: [{
           forMemberRef: 'extid:all_members',
           maySee: true,
           mayCreatePages: true,
           mayPostReplies: true,
           mayEditOwn: true,
         }, {
           forMemberRef: 'extid:staff',   // admins and moderators
           maySee: true,
           mayCreatePages: true,
           mayPostReplies: true,
           mayEditAll: true,     // for staff only
           mayDeleteAll: true,   // for staff only
         }]
      }]
      }
      1. CChristian Scheuer @chrscheuer2019-07-14 16:43:23.681Z

        This looks great.

        Our auto-created packages should be visible to anybody for now, but I agree it makes sense to touch on the permissions in the upsert API schema. Maybe it can default to "normal public category" if no permissions are specified, just for ease of use?

        We're currently using "Trusted member" to give our alpha/beta team access to restricted categories that are only meant for them. Ideally whichever solution is made for the permissions schema it will be forwards compatible with if you will make access groups a real feature with more granular control some time in the future. I'm thinking the forMemberRef is referring to groups, so this sounds like you already have it covered.

        1. CChristian Scheuer @chrscheuer2019-07-14 16:45:43.980Z

          Oh wow. Now that I took a look, I see there's now a Custom Groups section!! That's epic!! When did this come? :)
          You should do a newsletter or something where we can follow the progress haha so we don't miss new awesome features

          1. KajMagnus @KajMagnus2019-07-22 11:06:52.746Z

            Ok yes a newsletter is a good idea :- ) Maybe I can create a Newsletter category here in the forum? I'll subscribe you to that Newsletter category then.

            The custom groups thing appeared about 2 months ago, and ... hmm I wasn't good at informing people about that. Maybe I should think a bit about who might want to know about that, and contact them and let them know. I can do shortly after the next release :- ) and then I can mention import-export one's site too.

            1. CChristian Scheuer @chrscheuer2019-07-22 11:12:37.896Z

              Newsletter category sounds good to me. Strictly with updates about new features or changes.
              That way we can see the progress on a timeline.
              You could also automate publishing release notes from git commits maybe? I would like to at one point create automated "release pages" in TY whenever we release a new internal build. Right now I only do it once every 20 builds or so, since it's tedious to do. If it's fully automated, stuff gets a little easier :)

              1. KajMagnus @KajMagnus2019-07-22 11:56:59.852Z

                Release notes from Git commits — how would that work? Maybe there could be an API endpoint to which one can POST the output from sth like git log -n123. Talkyard then creates a topic that summarizes the changes in the git log? And one could connect GitHub etc with this Talkyard endpoint, via sth like Zapier? Maybe trigger a GitHub —> Talkyard request, whenever a GitHub milestone gets completed? Or weekly? Once per commit would be annoyingly often I suppose.

                Feel free to create an Idea topic about this if you want to :- )
                Probably won't have time to look into this the nearest time, still, would be nice to remember the idea.

                1. CChristian Scheuer @chrscheuer2019-07-25 00:15:17.098Z

                  Hi @KajMagnus.
                  Sorry I wrote a long reply but my phone browser crashed.
                  Yea it's something like that that would be nice. I think what we're looking for is to start with just an API endpoint like the category upsert but for posts CRUD. Conversion / formatting from git log to markdown format should be the API consumer's responsibility IMO. But yea agree that's down the way in any case.

            2. In reply tochrscheuer:
              CChristian Scheuer @chrscheuer2019-07-14 18:13:58.575Z

              One thing to note. If we go with the "extid:blabla" and "tyid:blabla" string ref thing. Make sure that it parses stuff like "extid:blabla:blabla:blabla" correctly, that is when the external ID itself has colons (since ours do). This essentially makes it a URN with a custom scheme name.

              1. KajMagnus @KajMagnus2019-07-22 11:15:41.331Z

                Ok. B.t.w. I'm thinking maybe it'd be good if the ext id can be a URL. So maybe I can allow everything that's ok in a URL.

                1. CChristian Scheuer @chrscheuer2019-07-25 00:23:37.861Z

                  Sure, sounds good. As long as it's not limited to urls. Ours would not be valid URLs in the http sense (we have two colons) and would feel awkward to have to convert them (big potential source of bugs).
                  I would expect it to allow any string (probably UTF-8 encoded) after an initial ext: up to a certain character limit (I think you mentioned 100).

                2. In reply tochrscheuer:
                  CChristian Scheuer @chrscheuer2019-07-14 18:45:13.058Z2019-07-14 19:18:52.841Z

                  Another thing I'm thinking could be good for the API implementation, maybe not in version 1, but down the road, is to address the very very usual case where all we need is just a link to an existing category since it was already created. 99.9% of requests, even a higher percentage down the road, are going to be read-only requests, not needing to upsert anything.
                  It may be relevant to have a very speedy API that just gives back a link from an in-memory cache / redis based on the provided extId. This API would then fail in the case where the category hasn't been created.
                  Of course we can build this caching layer ourselves too, so it's not a high priority. Just something worth considering at one point. Maybe it's even better if we build this cache ourselves actually, since we'll know when the original data (name etc) changed, which could change the URL, so we'd render our cache invalid in those cases.

                  1. KajMagnus @KajMagnus2019-07-22 11:09:36.380Z

                    Good idea, I can add that. Probably I would want a caching layer, regardless of if you build a cache. Because other Talkyard API consumers might not build any cache, and ... then I'd like to have a cache, because of them.

                    1. CChristian Scheuer @chrscheuer2019-07-22 11:12:46.967Z

                      Haha I thought you'd say that :)

          2. C
            In reply toKajMagnus:
            Christian Scheuer @chrscheuer2019-06-14 12:44:34.620Z

            Wohoo! What a great Friday gift :)

            @chrscheuer what do you think? Something that can be done in a better way, or maybe re-thinking the whole approach?

            This definitely seems like a very good approach very close to what I was hoping for.
            I completely understand what you mean by "simple" and I agree with that concept. To me making the API surface smaller is only gonna help creating tests for it and ensuring that it works across all cases.

            Couple of things we should consider:

            • What does the endpoint return? Some would probably like the id, however we'd probably just want the url (value taken from the generated page's pagePath.. if I remember correctly). Maybe we need a SimpleCategoryUpsertResponseV0 interface that either responds with simple things like an url, and/or with all the generated entries from the various tables.

            • URLs matter. Right now all posts/threads/pages have id + slug based urls, where the slug can change without affecting the URL since the id stays the same. It appears categories don't have this (at least right now). In the SimpleCategoryV0 interface the slug is a required parameter. This leads me to think that by changing the slug we can (accidentally or because we want to) change the url of the category. We need to think about how this behavior should be. Our users can change the name of a package but it stays with the same internal ID (and thus internal url). We'd want the category url to stay the same too, across different slugs.
              Maybe you already have support for this, but it would make sense to me that these auto-generated categories would utilize the same url pattern as posts. Maybe a flag to set in the interface?

            • Now for the interesting part. In the future you'll want to support sub-categories. We'll need to think about how those urls should look. To the user, it would make sense that each subcategory appends itself to the url of its parent. However that doesn't seem consistent with the -uniqueid/slug pattern that I requested above. So this is something that needs to be considered.

            • Slug algorithm/restrictions. Which characters are allowed in the slug? Which length?

            • id magic numbers > 2e9. This sounds great and well thought out for more complex insertions.

            • Edit: added this: Are there any limitations for extId strings length wise? Our id's for packages are concatenated user id + package id + so ours are about 60 characters long...

            That's what I think for now. Thank you so much for looking into this :)

            1. CChristian Scheuer @chrscheuer2019-06-14 17:19:23.332Z

              Another way to handle category url changes would be to make sure to keep a "history" inside TY of previous urls, so that categories whose urls (slugs) change, would just have permanently occupied the old space that will then redirect to the new one.
              That would keep it possible to have easy-to-read nested category urls, like:
              http://forum.soundflow.org/latest/packages/my-package

              1. KajMagnus @KajMagnus2019-06-18 12:15:15.314Z
                • What does the endpoint return — yes seems like the category id and the slug should be included in the reply. Maybe a list of all things that got created, yes. Maybe you'd find the cateory and its id and slug like so responseJson.categories[0].slug and [0].id and .urlPath.

                • Category slugs and renaming categories: I like the approach with keeping a history, and having old (sub) category url paths rediect to the current up-to-date path, like /latest/packages/current-package-name. This is b.t.w. how discussion topic urls already work — if you change the url to a topic, the old url redirects to the new (should work also for pages for which the /-1234/ id number has been excluded in the url (there's an "advanced" setting for hiding the page id in the url))

                  I'm thinking people renaming one package to [the previous name of another package], is something you'll need to deal with in Soundflow? (mabe by not allowing such renames?) And it's okay if, in Talkyard, if a category Bbb gets renamed to [a previous name of another category Aaa], then Ty stops the old url path to Aaa from redirecting to the new path to Aaa, and instead that url path will be to Bbb (although it was previously used by Aaa).

                • "any limitations for extId strings length wise?" — I have in mind an external-import-id 100 char length restriction (i.e. > 60).

                • "Which characters are allowed in the slug" — I have in mind lowercase alpahumeric + hyphen - ok inside (but not starting or ending with -) and at least one letter (not only digits).

                1. CChristian Scheuer @chrscheuer2019-06-22 17:21:54.918Z

                  What does the endpoint return — yes seems like the category id and the slug should be included in the reply. Maybe a list of all things that got created, yes. Maybe you'd find the cateory and its id and slug like so responseJson.categories[0].slug and [0].id and .urlPath.

                  Sounds good to me - so basically close to a copy of the simple input with urlPath added...
                  It might make sense at one point to also have the complete raw entries present in the response, but for the current use cases that would just make the response larger and thus slow stuff down (slightly). Just thinking about how generic we can make the request/response object layout so that it's flexible enough for other use cases too.

                  I'm thinking people renaming one package to [the previous name of another package], is something you'll need to deal with in Soundflow? (mabe by not allowing such renames?) And it's okay if, in Talkyard, if a category Bbb gets renamed to [a previous name of another category Aaa], then Ty stops the old url path to Aaa from redirecting to the new path to Aaa, and instead that url path will be to Bbb (although it was previously used by Aaa).

                  Agree completely.
                  This also confirms that the API caller (SoundFlow in this case) should be in charge of creating slugs since that puts the responsibility on us for detecting identical URLs and providing a UI for the user or an algorithm to make sure that it doesn't happen.
                  The opposite approach would be for the slug algo to sit with Talkyard and then TY would just append numbers to slugs in case something already existed at the same address/path - but that would give the API caller less control.

                  "any limitations for extId strings length wise?" — I have in mind an external-import-id 100 char length restriction (i.e. > 60).

                  Sounds great!

                  "Which characters are allowed in the slug" — I have in mind lowercase alpahumeric + hyphen - ok inside (but not starting or ending with -) and at least one letter (not only digits).

                  Sounds great!

            2. C
              In reply toKajMagnus:
              Christian Scheuer @chrscheuer2019-07-29 01:33:14.312Z

              @KajMagnus let me know where you think we're at in the review process.
              I feel like we're pretty close to this being possible to implement in a v1 (v0), right?
              I'm asking since we still have a window to include this in our big next release :)

              But actually the reason I opened up this forum today was to let you know that using Talkyard for Soundflow has been a major shift for us in being able to handle support requests, building up our knowledge base, handling bugs etc. I'm so happy using it every day. If you want us to help promote it or you need some testimonials, I'd be very happy to help.

              1. KajMagnus @KajMagnus2019-07-29 08:29:57.319Z

                Hi Christian, I've written the upsert-categories code (just merged into the master branch), and the automatic tests; seems to work fine. I'm about to start code reviewing the API now — this will take about a week (I'll also code review the import-Disqus-comments code which I did together with this.) Probably I'll find things to fix, 1 more week. And after that, I will want to try out the new version, here at Talkyard .io, for maybe half a week, before deploying to Talkyard .net (your server). So about 2.5 weeks before the upsert-categories endpoint is live in your forum, I'd think.

                B.t.w. I also partly implemented sub categories, so you'll be able to upsert into sub categories in a parent "Packages" category with ext id "soundflow_packages", for example.

                Roughly when is that next release?

                Yes a testimonial would be helpful — I've been thinking about asking people about that actually :- ) How do we go about doing that? Maybe if you write something, then I create a Testimonials section on the website and insert?

                1. CChristian Scheuer @chrscheuer2019-07-29 08:50:04.576Z

                  That sounds great!
                  We plan to release around September 1st, but that also means to beta test it we'd need it pretty soon in order to get it in before we lock our branch. I think it sounds close to doable with your time frame, but it's gonna be a little tight. Would there be any way to test our integration before somehow? Maybe via a test forum somewhere?
                  Wrt testimonials - that's entirely up to you. The website thing would be a good place to start, but I'm thinking you should be pretty close to being able to make some public releases / statements about Talkyard (it feels pretty stable and feature complete to me). One of our partners did a nice graphic with logos of all their partners. You could do something similar. Anything that makes it look fancy is good IMO :)

                  1. KajMagnus @KajMagnus2019-07-29 09:32:32.694Z

                    Hmm I'll spend two days doing code review, and then I'll know better. Not impossible that there' won't be so many things to fix. Then I can get back to you.

                    Ok, a public realease / statement — that feels like a good idea to do, together with this new version with export & import. Mabye can combine with your / others logos or sth like that.

                    1. KajMagnus @KajMagnus2019-07-31 13:52:46.132Z

                      @chrscheuer Seems I won't run into any major unexpected things now during code review. Probably I'll have deployed the new version at the end of next week, like, August 9 or 10, and then you can start integrating with it.

                      1. CChristian Scheuer @chrscheuer2019-08-01 15:27:13.511Z

                        That's wonderful news!! Thank you :)

                        1. KajMagnus @KajMagnus2019-08-09 09:46:58.759Z

                          @chrscheuer I just deployed the new version to here at Talkyard .io. I have in mind to upgrade Talkyard .net (your server) tomorrow or on Sunday.

                          Meanwhile, maybe you'd be interested in this end-to-end test of upserting categories — it shows the API in use:
                          https://github.com/debiki/talkyard/blob/02f7ae55ed7f38a3c3c124168bf60f9e41be6e5c/tests/e2e/specs/api-upsert-categories.2browsers.test.ts
                          https://github.com/debiki/talkyard/blob/02f7ae55ed7f38a3c3c124168bf60f9e41be6e5c/tests/e2e/utils/server.ts#L418

                          1. CChristian Scheuer @chrscheuer2019-08-09 10:24:14.939Z

                            That's great @KajMagnus!
                            Perfect timing :) Thank you so much for getting this ready.

                            Just skimmed over the tests. Looks great. One thing I noted, in the API response you don't seem to be expecting a url or a path.
                            Just to be sure: Will the API response return enough for us to be able to construct a URL for the category?

                            1. CChristian Scheuer @chrscheuer2019-08-09 10:28:52.524Z

                              Or maybe the path is deterministic based on the slugs? In any case, the url is the real "answer" we're looking for from the API in 99% of cases. Eg. our site talking to TY would say: I give you the info, you give me the URL, so I can redirect.
                              What happens if you define a slug that has already been used (with the same parent category)?

                              1. KajMagnus @KajMagnus2019-08-09 11:37:18.754Z

                                The URL path is deterministic and depends on the slug, yes. Since the forum's "home page" is at /, then, a category with slug the-cat-name is "placed" at /latest/the-cat-name. However I'd like to change this, to /latest/base-category/sub-category — in your case, /latest/packages/package-name. But right now it's just /latset/package-name.

                                It'd be good if I included the url path to the category, in the response. Then, later when I add base-category-slug/ before the sub category slug, you wouldn't need to update your code (you'd automatically redirect to the correct address, if you use the url path in the response. I didn't think about this until now).

                                What happens if you define a slug that has already been used (with the same parent category)?

                                As of now, slugs need to be unique accross the whole forum (not scoped to base categories — not yet implemented).

                                Upserting a category with the same slug as another cateogry (with a different external id), returns an error. (There's a unique constraint in the database.)

                                If a slug gets renamed — I haven't yet implemented the redirect-to-the-newest-name, so, going to the old slug, will result in "Category not found".

                                1. CChristian Scheuer @chrscheuer2019-08-09 11:43:31.053Z

                                  It'd be good if I included the url path to the category, in the response. Then, later when I add base-category-slug/ before the sub category slug, you wouldn't need to update your code.

                                  Yes I completely agree. It would be best that the API consumer doesn't have to make any assumptions about how to construct URLs at all.
                                  For this round, it's fine to do it ourselves since we're in a hurry to get this in before our launch. But I suspect as soon as you want to introduce the URL change (which of course is a partially breaking change to clients as well) we could have downtime if we don't update our algorithm at the exact same moment the TY server upgrades.
                                  Maybe it's best if you add the feature to return a URL in the API before making any URL changes, so we won't have downtime. (Ie. first version of API released now, 2: introduce URL returning in API, 3: make changes to URLs).

                                  As of now, slugs need to be unique accross the whole forum (not scoped to base categories — not yet implemented).

                                  Gotcha. Not much of a problem right now for us.
                                  But I'm wondering if this will cause a database migration headache for you? If slugs are a unique column now, and we start filling in data, will you need to do some data migration for it to work when slugs are scoped?

                                  Upserting a category with the same slug as another cateogry (with a different external id), returns an error. (There's a unique constraint in the database.)

                                  Great! (With the exception of scoping)

                                  If a slug gets renamed — I haven't yet implemented the redirect-to-the-newest-name, so, going to the old slug, will result in "Category not found".

                                  Good to know.

                                  1. KajMagnus @KajMagnus2019-08-09 12:14:46.752Z

                                    best if you add the feature to return a URL in the API before making any URL changes [...] 1, 2, ,3

                                    Yes that's what I'd like to do. (And will do :- ))

                                    I'm wondering if this will cause a database migration headache for you? If slugs are a unique column now, and we start filling in data, will you need to do some data migration for it to work when slugs are scoped?

                                    I think the migration is going "in the right direction". Meaning, since slugs need to be unique accross the whole forum today, things will work fine also when they need to be unique only in a specific category. ... Migrating in the other direction, though, would be ... headaches.

                                    ***

                                    I'm thinking this new version has what's needed, to start trying things out then? There're some annoying things, but there doesn't seem to be any showstoppers (that we know about), right

                                    I didn't implement the caching endpoint yet. Maybe you could tell me how you would like it to look and work? E.g. suggest an url path and tell me how the reply could look? (I feel a bit unsure about precisely what you need)

                                    1. CChristian Scheuer @chrscheuer2019-08-09 13:26:59.849Z

                                      Great!
                                      Yes this should be enough to get us going for sure.

                                      Wrt caching, let me think about it first. I think it makes sense for us to cache it in our own layer.
                                      Essentially think about it like SSO. We have a customer who navigates to soundflow.org/store/somepackageid. Then they want to go to the forum for that package, they click a link that for example takes them to soundflow.org/store/somepackageid/forum.
                                      Now our server or client (actually we're running serverless so this will go to our API endpoint) will ask TY to essentially just redirect us to the right URL.
                                      Since we're serverless (all content on soundflow.org is statically delivered, and anything dynamic happens clientside - Firebase style) we'll likely cache this information in our server side rendering (React SSR) and pre-render the actual forum links.
                                      And for those that aren't cached, it will just go to our API which will ask TY.

                                      In other words... Having a caching feature at TY is not necessarily going to help us much, since we still need to have an API endpoint (with the API secret to be able to create new categories) to deal with things. So a caching implementation from TY won't help us except for doing pure serverless stuff where all categories are created ahead of time.

                                      So.. In short, let me see if we run into a problem or not, and I'll let you know if I see a space where a TY API caching endpoint would help us, but I suspect that it may not give us anything that we can't make even more performant ourselves.

                                      1. CChristian Scheuer @chrscheuer2019-08-10 14:00:38.645Z

                                        Making ready for the API in our app :)

                                        1. KajMagnus @KajMagnus2019-08-11 08:12:14.951Z

                                          Wow, is that how Talkyard looks, when embedded inside the Soundflow app?
                                          (... Or is it a backoffice app screenshot? or maybe some development env?)

                                          Does that mean embedding Talkyard in an <iframe> works ok now? — Then that seems like a feature I could mention somewhere :- )

                                          1. CChristian Scheuer @chrscheuer2019-08-11 10:58:24.436Z

                                            This is how it will look inside the SoundFlow app yes - for users.
                                            It's not an iframe however, it's a WebView component - so to the forum it looks like it's the top of the hierarchy (eg. there's no window.top). We decided to stay with Electron 3 for some time where that method is still supported. We'll transition to iframe later on because as we talked about there are a number of things that are harder in that method.

                                            1. KajMagnus @KajMagnus2019-08-11 13:00:27.141Z

                                              Ok. It'd be interesting to know when you start experimenting with iframes, and how it works for you. (Some other people also wanted to place Talkyard in an iframe, so seems like a useful feature.)

                                              1. CChristian Scheuer @chrscheuer2019-08-11 13:14:06.450Z

                                                Yea definitely! I'll take you up on this soon.
                                                We may want to embed the package forums in iframes on our website, so I may get to this as soon as this week. But I'll probably be looking for the easy solution this time around.
                                                What I think we'll need is granular control over when to show which kinds of UI elements (top bar, left bar, right bar). This could include a possibility to via the embedding URL to set a classname on the html element, so that any embedded css/html we have in the forum could react to that.
                                                Also some kind of way to know how to either restrict navigation that goes beyond what's embedded (for example a category), or in those cases to get a hook that would allow us to direct that navigation request to somewhere else. In the case of the website, if navigating outside of the category, we'd want the full top frame to go to that new location, discarding the content around the iframe. In the case of the SoundFlow app, we'd redirect the request to a different tab in case of the request still being inside of the forum, and in the case where navigation would take us off-forum, we'd open a browser.

                                                1. CChristian Scheuer @chrscheuer2019-08-11 13:26:52.153Z

                                                  By the way - feel free to ask for more screenshots if you want to show it off as a feature :) I think once we have it fully integrated it would make sense to make a little collection.

              2. Progress
                with doing this idea
              3. @KajMagnus marked this topic as Started 2019-06-23 00:09:08.286Z.
              4. KajMagnus @KajMagnus2019-08-11 08:16:07.707Z

                @chrscheuer Now I just deployed the new version (v0.6.41) with the upsert API. Feel free to try it out. B.t.w. there are rate limits:

                  object UpsertSimple extends RateLimits {
                    val key = "UpSm"
                    val what = "Upserted things too many times"
                    def maxPerFifteenSeconds = 10
                    def maxPerFifteenMinutes = 30
                    def maxPerDay = 90
                

                I you'd like to try out the API on a test site first (instead of your real live site), then go here: https://www.talkyard.io/choose-demo-forum, and click Create my own test forum. Then send me a link with the URL to the test forum, so I can enable the upsert API for that test forum too.

                1. C
                  Christian Scheuer @chrscheuer2019-08-11 11:21:11.408Zreplies toKajMagnus:

                  Awesome! Looking forward to trying it out. Thank you for pushing this!
                  Wow - the rate limits definitely will make sure that we can't use this as the primary method for users to get to the forum (that would limit us to 90 page views of a forum per day). So with this we'll need to implement our own caching for sure.
                  I'm a bit afraid to hit that daily or 15 min limit just in debugging when implementing the system. Since we'll need to do our own caching this also means I'll need to run a bulk operation once we have completed the integration, one that runs over all the packages in our system. I'm pretty sure we could hit those limits pretty easily once we want to run the bulk operation.
                  This essentially means we'll need to implement queueing on our end.

                  As I see it there are a couple of ways this can work:

                  1. Talkyard has a rate limited API. This forces us to have a database trigger (when user publishes a package, we register a slug, we talk to TY to get a forum url for that slug, this is stored)
                    When user wants to navigate to the forum, we'll know it already works. If the TY API call failed, we'll need to either have it add the request to a queue that manages to keep us below the rate limit (a lot of extra work), or build in workarounds to access the API when the user tries to go to a forum that didn't get properly created. This could in turn mean even more API access leading to cascading errors of the whole system.
                  2. Talkyard has a non-cached, non-rate-limited API (or the limit is high enough we won't see problems). We still have a database trigger - so we access the API only once per package publishing. Effectively the queueing concern will be gone so no need to build in that or any workarounds. The effective number of requests to the API would be the lowest with this approach.
                  3. Talkyard has a cached, non-rate-limited API. Since SoundFlow still needs to provide the slugs, that's a potentially slow operation, so we'd still like to do this not on a pageview level but as part of the database trigger.
                    So the Talkyard internal caching would here just be an implementation detail of the TY system and lead to potential better usage of server resources. However we'd have the exact number of API calls as in #2, just 1 per package creation.
                  4. (Rather 3b) SoundFlow splits up our system so that we make sure to define unique slugs as part of our database trigger (when user publishes a package), but then utilize a Talkyard caching endpoint to use it for every pageview/redirect operation. This would have TY API see the most traffic. We would send the same data as now but instead of for every package creation, we'd send for every page view / redirect from main site package to forum package category (same style as SSO).

                  If we did not need to provide slugs, the system could be simplified for the API consumer, and a potential cached solution from TY would have even more value than #4. Because this would allow us to use the API just like SSO and we wouldn't need to have our own separate algorithm for slugs. Essentially we could use it then more or less use it as part of a redirect for the user, whenever such a forum url would be accessed. We would in this case just send an extId and a name (and parentRef) and get back an url. All stuff that wouldn't require us to have any queueing or database triggers. This would be almost like embedded categories.


                  The issue here IMO is in putting a good distinction between what's the role of the API and the API consumer. With these pretty low limits it's making some scenarios that should be simple for us to implement much more complex. I'm a bit reluctant to go too far down that rabbithole if we can come up with a better solution.

                  1. C
                    Christian Scheuer @chrscheuer2019-08-11 11:23:07.592Zreplies tochrscheuer:

                    In essence what I'm afraid is that whereas this API would be called as often as SSO and have no bigger cost on the system than SSO when used as a Category upsert function, the API was designed to be able to do bulk operations, which of course you'd probably need to rate limit.
                    One idea: Would it make sense to lift the limit (or put it same as SSO) for simple queries that just contain a single category?

                    1. C
                      Christian Scheuer @chrscheuer2019-08-11 11:40:32.605Zreplies tochrscheuer:

                      Sorry if any of this sounded too critical :) I'm very happy we've gotten this far! And this should definitely be something we can work with. Via our Firebase setup we could do queueing etc if really needed. So we will get something great eventually, no matter what. This is good as a 1st API iteration.
                      But since what we're trying to do is essentially a specialized version of embedded categories - I think it doesn't hurt to talk about if the API can move towards something equally simple to that (which would be essentially completely URL based I suppose).

                      1. KajMagnus @KajMagnus2019-08-11 12:59:03.629Zreplies tochrscheuer:

                        whereas this API would be called as often as SSO and have no bigger cost on the system than SSO when used as a Category upsert function, the API was designed to be able to do bulk operations, which of course you'd probably need to rate limit.

                        One idea: Would it make sense to lift the limit (or put it same as SSO) for simple queries that just contain a single category?

                        Yes, this sounds like a good idea. Sometimes the /-/v0/upsert-simple requests wouldn't need to write to the database at all. Probably the requests can be optimized to access only the in-memory cache. — Maybe there could be different rate limits: 1) one for requests that upsert just a few items (e.g. one single category) and that actually didn't need to upsert anything, didn't write to the database. And 2) another rate limit, for requests that do write to the database, but just a few items (like, one category). And 3) a third rate limit, for bulk upserting lots of things.

                        What rate limits do you think would work for you, for the initial import of everything?

                        So you won't need to spend time writing your own queue to stay below the limits, and so you won't need to do a bulk operation, ... But instead can upsert everything, one request at a time (that's simplest for you?) as fast as you want?

                        (Maybe this would be approximately the number of packages you have? Per 15 seconds?) ...

                        ... Maybe I can just bump the limits a bit (or a lot) for now, since currently one needs to ask for permissons to use this API endpoint anyway. And I can restrict /-/v0/upsert-simple to only accept a single category — and reject bulk uploading lots of things. And then later on, I can add more restrictive rate limits, for API calls that bulk uploads lots of things, and start allowing that.

                        (I'll read everything you wrote in more detail a bit later)

                        1. C
                          Christian Scheuer @chrscheuer2019-08-11 13:08:55.461Zreplies toKajMagnus:

                          Thanks @KajMagnus for your quick reply!
                          Great - sounds like we agree on it :)

                          (Maybe this would be approximately the number of packages you have? Per 15 seconds?) ...

                          Yea that would be great to get started. And yes exactly since we won't be running bulk operations normally, I'd prefer to implement this as a per-package request (so no bulk API requests), and then just trigger them manually. This would test the integration AND get the categories created in the same operation.
                          We have 57 packages right now.
                          I think most of these will see very little traffic in the beginning, but I expect the pace to ramp up this fall (we launch version 3 on September 5). So I think our case will be a good real life example to stress test the performance of the server.
                          This, of course, is perhaps the single biggest benefit of Talkyard (besides the fact that you're so available) - the speed! So we definitely don't wanna be the cause of a performance slowdown.

                          ... Maybe I can just bump the limits a bit (or a lot) for now, since currently one needs to ask for permissons to use this API endpoint anyway. And I can restrict /-/v0/upsert-simple to only accept a single category — and reject bulk uploading lots of things. And then later on, I can add more restrictive rate limits, for API calls that bulk uploads lots of things, and start allowing that.

                          This sounds perfect. Once we're done testing/implementing, since I'll do the slug creation and cache all the API results, we'll revert back to close to zero requests/day. The 57 packages we have were created over the course of many months, so we won't need much API traffic after our initial runs and integration tests (so yea even having high limits won't affect the server if we're the only ones you've let in)

                          1. C
                            Christian Scheuer @chrscheuer2019-08-11 13:24:57.331Z

                            I've created a demo site now here:
                            https://test--soundflowtest.talkyard.net

                            Would be great to test the integration here first :)

                            1. C
                              Christian Scheuer @chrscheuer2019-08-11 13:35:01.895Zreplies tochrscheuer:

                              Is this still the definition for slugs?

                              "Which characters are allowed in the slug" — I have in mind lowercase alpahumeric + hyphen - ok inside (but not starting or ending with -) and at least one letter (not only digits).

                              1. C
                                Christian Scheuer @chrscheuer2019-08-11 13:38:48.592Zreplies tochrscheuer:

                                Looking at this for a slugify function. Do you see any problems with it?

                                function slugify(string) {
                                  const a = 'àáäâãåăæąçćčđďèéěėëêęğǵḧìíïîįłḿǹńňñòóöôœøṕŕřßşśšșťțùúüûǘůűūųẃẍÿýźžż·/_,:;'
                                  const b = 'aaaaaaaaacccddeeeeeeegghiiiiilmnnnnooooooprrsssssttuuuuuuuuuwxyyzzz------'
                                  const p = new RegExp(a.split('').join('|'), 'g')
                                
                                  return string.toString().toLowerCase()
                                    .replace(/\s+/g, '-') // Replace spaces with -
                                    .replace(p, c => b.charAt(a.indexOf(c))) // Replace special characters
                                    .replace(/&/g, '-and-') // Replace & with 'and'
                                    .replace(/[^\w\-]+/g, '') // Remove all non-word characters
                                    .replace(/\-\-+/g, '-') // Replace multiple - with single -
                                    .replace(/^-+/, '') // Trim - from start of text
                                    .replace(/-+$/, '') // Trim - from end of text
                                    .replace(/^\d+$/, m => `category-${m}`) //Prepend "category-" if we had a number-only slug
                                }
                                
                                1. C
                                  Christian Scheuer @chrscheuer2019-08-11 15:14:10.682Z

                                  I appear to have it working now!

                                  https://forum.soundflow.org/latest/logic-pro-x

                                  Was just created via the upsert API :)
                                  Edit: and it was triggered by our full package publishing pipeline. Our cache layer and slug generator etc. seems to work as expected :) Yay!

                                  1. C
                                    Christian Scheuer @chrscheuer2019-08-11 15:22:01.432Z

                                    I decided to manually trigger the creation for now. Other users' packages' forums will be created next time they publish a new version of their package (or I'll trigger some more manually). Just so we didn't run over the limit here today.

                                    Everything appears to be working as expected.
                                    One little note - maybe there should be a limit UI wise on the number of sub categories shown (once we have 57 packages here it could become crowded):

                                    1. C
                                      Christian Scheuer @chrscheuer2019-08-11 15:25:30.645Z

                                      I'm inserting all these without setting the "position" property.
                                      Is there any chance automatic sorting could be implemented on the TY side, in case categories don't have positions?

                                      1. C
                                        Christian Scheuer @chrscheuer2019-08-11 21:03:58.516Z

                                        I have implemented the automatic categories across all of our packages now.
                                        I'm trying to embed the forum in an iframe here, but the SSO process seems to redirect the top window at some point in the process. You can see it in effect here:

                                        https://soundflow.org/store/dialog-editing-izotope
                                        1. C
                                          Christian Scheuer @chrscheuer2019-08-11 21:27:54.485Z

                                          And here it is now working in the app :)

                                          1. KajMagnus @KajMagnus2019-08-13 13:26:02.686Zreplies tochrscheuer:

                                            Hi @chrscheuer,

                                            automatic sorting could be implemented on the TY side, in case categories don't have positions?

                                            I think so — does that mean alphabetically? Or by most-active-first? (I'll look into the SSO-top-window-redirect issue first)

                                            And here it is now working in the app :)

                                            Thanks for the screenshot, I think it looks nicely integrated :- ) (Hmm possibly some minor things to maybe consider some time later, like: the title "soundflow" on the top nav bar with the blue background — maybe that title isn't needed? And the forum title a bit below, again. Maybe some URL parameters could hide the forum title & intro text, + top nav bar? Or a CSS class on the html elem, if is embedded. )

                                            If one navigates to a different category, then clicks a different app tab (say, the "Save Version" tab), and then clicks "Package Forum" again — then, does the forum view get recreated, and one starts in the Online/Offline package again? Or is the forum view remembered, and one is still in that different category?

                                            the SSO process seems to redirect the top window at some point in the process

                                            I think there're two problems: 1) The Package Forum tab link has a redirect url that sends the browser to Talkyard's forum, rather than back to the page at Soundflow. This page: https://soundflow.org/store/dialog-editing-izotope/forum somehow redirects the browser to: https://soundflow.org/forum?returnTo=https%3A%2F%2Fforum.soundflow.org%2Flatest%2Fdialog-editing-izotope — and look at the returnTo=... parameter. It tells SoundFlow to send the browser to Talkyard's domain.

                                            Something like this instead, could work: https://soundflow.org/forum?returnTo=https://soundflow.org/store/dialog-editing-izotope/forum

                                            Then there's problem 2: Because of maybe-security-issues, Talkyard strips the url origin from the_return_to_url in: http://localhost/-/v0/sso-login?oneTimeSecret=nnnnn&thenGoTo=the_return_to_url. So even if Talkyard gets an url with the correct origin (i.e. https://soundflow.org/store/dialog-editing-izotope/forum), Talkyard will refuse to use it, and only looks at the url path (/store/dialog-editing-izotope/forum) instead. — I previously didn't want to allow other url origins than the Talkyard server itself, because of maybe-pishing attacks.

                                            Here's Talkyard's code that runs when calling http://localhost/-/v0/sso-login?oneTimeSecret=nnnnn&thenGoTo=https://soundflow.org/store/dialog-editing-izotope/forum

                                                    // Remove server origin, so one cannot somehow get redirected to a phishing website
                                                    // (if one follows a link with an "evil" go-next url param to the SSO login page,
                                                    // which then redirects to this endpoint with that bad go-next url).
                                                    val thenGoToUnsafe = getOnly("thenGoTo")
                                                    val thenGoToHashEncoded = thenGoToUnsafe.flatMap(Prelude.stripOrigin) getOrElse "/"
                                            

                                            That protects against someone who crafts an evil link:

                                            https://soundflow.org/forum?returnTo=https%3A%2F%2Fwww.evil.com
                                            

                                            Then, the user would login over at soundflow.org — the real domain — and believe s/he is safe? But would then get sent to an evil server, possibly phishing. Well, unless one or both of SoundFlow and Talkyard detects and rejects unexpected redirect-to destination domains.

                                            Maybe there could be a whitelist of allowed go-to-after-SSO-login origins? There's a setting allowEmbeddingFrom, intended for embedded comments. Maybe Talkyard could allow redirecting to the allowEmbeddingFrom origins after SSO login?

                                            1. KajMagnus @KajMagnus2019-08-13 18:14:08.054Zreplies tochrscheuer:

                                              One little note - maybe there should be a limit UI wise on the number of sub categories shown (once we have 57 packages here it could become crowded):

                                              Hmm, yes that seems like a probably good idea. Maybe a max height, + a button "Click to show" ? I could look into this a bit later, after the iframe SSO redrect, and rate limiting issue, and returning the category url path.

                                              Another approach could be to place the Packages category last, so it'll be at the bottom of the page. Then maybe doesn't matter if there're a lot of sub packages and the Packages section gets really tall? Since there'd be nothing below

                                              1. KajMagnus @KajMagnus2019-08-13 18:36:02.708Zreplies tochrscheuer:

                                                Is this still the definition for slugs?

                                                Yes. I'm not sure about foreign langage chars, e.g. Chinese? But maybe for now, non-ASCII chars can be forbidden, ... until after a bit research. I'm thinking only [a-z0-9-] is safer, for a start.

                                                Looking at this for a slugify function. Do you see any problems with it?

                                                I had a not-thorough look ( + I'm tired :- )), and I think it looks fine, mainly because of this line: .replace(/[^\w\-]+/g, '') which sort of deletes all "weird" things ... since \w = [A-Za-z0-9_] says MDN. B.t.w I have a slugify function in Talkyard, which converts titles to slugs. Here it is:

                                                https://github.com/debiki/talkyard/blob/dd46d893a2d85dfaaf0d89fa0c599bf94aadb35c/client/third-party/non-angular-slugify.js

                                                Seems there's a bug in my slugify — it allows pure numeric category slugs. Hmm your slugify accepts _, Talkyard's doesn't. Maybe it's better if I change Talkyard and allow _ in slugs.

                                                (Talkyard's slugify is based on an AngularJS plugin or something. I was using AngularJS long ago before moving to React.)

                                                1. C
                                                  Christian Scheuer @chrscheuer2019-08-13 18:39:30.253Zreplies toKajMagnus:

                                                  Maybe a max height, + a button "Click to show" ? I could look into this a bit later, after the iframe SSO redrect, and rate limiting issue, and returning the category url path.

                                                  Yea that sounds great. Haha yea this did turn up quite a list of follow up items :)
                                                  For now we're completely good with the current solution, everything seems to work as expected.

                                                  I had to hard program the URLs around in our site and since we're caching them and prerendering the URLs to the site's static content (which means Google is also picking it up) we'll likely experience a little down time on forum links once you make the change. We may also have customers bookmarking or deep linking to the current URLs. So maybe before the URLs change it would make sense to look at redirects (history). We can live without it, but of course it would make the user experience a little better (and make sure we don't take SEO hits). Implementing redirects would be more important to us than returning URLs in the API.

                                                  Another approach could be to place the Packages category last, so it'll be at the bottom of the page. Then maybe doesn't matter if there're a lot of sub packages and the Packages section gets really tall? Since there'd be nothing below

                                                  Yea I'll look into that if we get an explosion of packages. But actually I prefer it to be up at top for the moment, since we primarily want users to use either the "How to" section or the "Packages" section, so hiding the content below is not a big issue for now - just something to think about in the future.
                                                  I think your max height + "click to show" sounds like a good solution and I agree with your prioritization to take this task last :)

                                                  1. C
                                                    Christian Scheuer @chrscheuer2019-08-14 03:37:44.772Zreplies tochrscheuer:

                                                    I just realized something.. Since you were talking about the slugs right now being global (not scoped).
                                                    Is there any potential for us to accidentally hit slugs that were created by talkyard in other categories, pages or posts (not synced from external)? We of course guarantee the slugs we use are unique, but if you just have a single index for everything, won't we risk accidentally using slugs that were already used by others?

                                                    1. KajMagnus @KajMagnus2019-08-14 05:36:38.618Zreplies tochrscheuer:

                                                      There're just a few default categories and slugs, like: staff, support, ideas and general or uncategorized. I'm thinking there're no SoundFlow packages with those names? If nevertheless there is a category slug collision, you could rename the "built-in" category slug to something else.

                                                      1. C
                                                        Christian Scheuer @chrscheuer2019-08-14 06:05:29.941Zreplies toKajMagnus:

                                                        There're just a few default categories and slugs, like: staff, support, ideas and general or uncategorized. I'm thinking there're no SoundFlow packages with those names? If nevertheless there is a category slug collision, you could rename the "built-in" category slug to something else.

                                                        Oh ok. It's not as much a practical problem right now no since we don't have packages with those names, but I think it's problematic in the long run since the API call will essentially fail even though we have no way of knowing which categories we can't hit. Our users certainly would have no way of knowing that - so if a user today creates a package called Support they just wouldn't get a forum for that package.
                                                        It's a side effect of the API design of putting the slug creation into the hands of the API consumer, so it's spilling complexity out instead of containing it. The fact that we have to cover all potential edge cases of our slug algorithm (you just found a bug in ours) I think could lead to more errors down the road.
                                                        To me this makes me feel like maybe it should be revisited if slugs long term really should be the API consumer's responsibility. It makes sense for me that we have slug responsibility if we have full control over all input, but if we can collide with data we don't know about, then it makes the whole system very complex (ie. we would have to download all data and put it in our own tables just to know which slugs are free - and this would not work on high loads since new slugs could have become used since we last downloaded everything).

                                                        1. C
                                                          Christian Scheuer @chrscheuer2019-08-14 06:09:04.900Zreplies tochrscheuer:

                                                          But yes just to be clear - this is more of a theoretical problem than a practical one at this point. If it's only the few default ones we can collide with it's not high priority for now, but I think this slug responsibility should be part of revisiting the API wrt URL returning.
                                                          Or maybe it should just be part of the API spec - ie. clarify which reserved words are not allowed in the slug we provide. Although this could still lead to problems with categories defined by users in other ways - so I still think we should reconsider if slugs would be better handled by TY.

                                                          1. C
                                                            Christian Scheuer @chrscheuer2019-08-14 06:10:52.404Zreplies tochrscheuer:

                                                            I had a not-thorough look ( + I'm tired :- )), and I think it looks fine, mainly because of this line: .replace(/[^\w-]+/g, '') which sort of deletes all "weird" things ... since \w = [A-Za-z0-9_] says MDN. B.t.w I have a slugify function in Talkyard, which converts titles to slugs. Here it is:

                                                            https://github.com/debiki/talkyard/blob/dd46d893a2d85dfaaf0d89fa0c599bf94aadb35c/client/third-party/non-angular-slugify.js
                                                            Seems there's a bug in my slugify — it allows pure numeric category slugs. Hmm your slugify accepts _, Talkyard's doesn't. Maybe it's better if I change Talkyard and allow _ in slugs.

                                                            Thanks for finding this bug! I think your design is cleaner by not allowing underscore - I'll fix it on our end.

                                                            1. KajMagnus @KajMagnus2019-08-16 04:30:43.095Z

                                                              @chrscheuer

                                                              We may also have customers bookmarking or deep linking to the current URLs. So maybe before the URLs change it would make sense to look at redirects (history). We can live without it, but of course it would make the user experience a little better (and make sure we don't take SEO hits). Implementing redirects would be more important to us than returning URLs in the API.

                                                              Ok, so I'll implement category redirects from old names to the current name, before moving sub category url paths to [below their parent category's url path].

                                                              To me this makes me feel like maybe it should be revisited if slugs long term really should be the API consumer's responsibility

                                                              What if, when one upserts a category, if one specifies a title only, but no slug, then Talkyard can auto generate the slug?

                                                              I'll include this in the /-/v0/upsert-simple response:

                                                              {
                                                                categoreis: [{
                                                                  ...
                                                                  urlPaths: {
                                                                    activeTopics:  '/latest/category-slug',
                                                                    topTopics:  '/top/category-slug',
                                                                    newTopics:  '/new/category-slug',
                                                                  },
                                                                  ...
                                                              

                                                              So, after having upserted a category, you can send the browser to:
                                                              serverOrigin + responseJson.categories[0].urlPaths.activeTopics.

                                                              1. C
                                                                Christian Scheuer @chrscheuer2019-08-16 18:53:58.677Zreplies toKajMagnus:

                                                                Ok, so I'll implement category redirects from old names to the current name, before moving sub category url paths to [below their parent category's url path].

                                                                Sounds great!

                                                                What if, when one upserts a category, if one specifies a title only, but no slug, then Talkyard can auto generate the slug?

                                                                This would be great. And I trust when Talkyard auto-generates it would always be valid, right? Meaning if there's a collision, it itself handles incrementing a counter at the end?

                                                                So, after having upserted a category, you can send the browser to:
                                                                serverOrigin + responseJson.categories[0].urlPaths.activeTopics.

                                                                Sounds good!

                                                                1. KajMagnus @KajMagnus2019-08-21 14:46:08.908Z

                                                                  Hi @chrscheuer, now I've released a new version — so now you can do: categories[0].urlPaths.activeTopics. I also bumped the rate limits:

                                                                    object UpsertFew extends RateLimits {     // <—— your case
                                                                      val key = "UpFw"
                                                                      val what = "Upserted a few things too many times"
                                                                      def maxPerFifteenSeconds = 150  // fairly high burst
                                                                      def maxPerFifteenMinutes = 600
                                                                      def maxPerDay = 2400  // but not super many, per day
                                                                  
                                                                  ...
                                                                  
                                                                    object UpsertMany extends RateLimits {
                                                                      val key = "UpMn"
                                                                      val what = "Upserted many things too many times"
                                                                      def maxPerFifteenSeconds = 10
                                                                      def maxPerFifteenMinutes = 30
                                                                      def maxPerDay = 90
                                                                  

                                                                  when Talkyard auto-generates it would always be valid, right? Meaning if there's a collision, it itself handles incrementing a counter at the end?

                                                                  Yes.

                                                                  ***

                                                                  What are the next most important things to do? Is it making Talkyard work in an iframe?

                                                                  Remaining to do list: (did I forget something?)

                                                                  • Redirect old category name, to new name.
                                                                  • Place sub categories below parent categories, in the ulr: /latest/base-category/sub-cat.
                                                                  • Change the database, so sub category names need to be unique only within their parent category.
                                                                  • Set a max height on the sub categoires list, + a "Click to show" button.
                                                                  • Sort sub categories alphabetically, in the select-category dropdown.
                                                                  • Fix the iframe redirect problem, so SSO will work also in embedded categories.
                                                                  1. C
                                                                    Christian Scheuer @chrscheuer2019-08-25 09:11:01.219Zreplies toKajMagnus:

                                                                    Great stuff!
                                                                    I've been offline for a week, and will be for another, so won't get a look at this until early September. We're launching SoundFlow 3 on September 5 so right now we're very zoned in on the launch process.

                                                                    Thanks for asking my opinion about the order.

                                                                    I would say the order in your items right now makes sense (the first 3 ones are in the proper dependency order IMO), except for the iframe redirect problem which I would address earlier (it feels more like a bug where the rest is more feature work).

                                                                    One thing I'm not sure if you addressed yet is the TY slug generation if slug is null/undefined. I would probably place that somewhere around the top of the list (right after fixing the iframe problem).