
Andrew Lock

~11 min read

Using Octokit.GraphQL to interact with the GitHub discussions API

In my previous post, I described the overall process of migrating comments from Disqus to giscus using a small .NET program. I described how I exported the comments to XML, cleaned the data, and converted it to a format I could use to create the comments on GitHub discussions.

In this post, I describe how to interact with the GitHub discussions API from a .NET app using the Octokit.GraphQL NuGet package. I show how to search and fetch discussions, how to create discussions, and how to create comments.

I was really hoping that I would have finished migrating all the comments by now. Unfortunately, GitHub have blocked the account I was using to migrate, so the comments still aren't visible on the discussions yet 🙁 More on that later…

REST API vs GraphQL

GitHub actually has two distinct APIs you can use to interact with it: the REST API and the GraphQL API.

The REST API is pretty much exactly what you'd expect. It has the typical resource-based architecture using GET/POST/PATCH/DELETE etc to interact with each resource.

If you're not already familiar, GraphQL is a data query language that is designed to make certain API interactions easier than using REST. You can use it to retrieve multiple resources in a single request, for example. It also helps with only returning the data you're interested in instead of over-fetching all the data for a resource.

Some of GitHub's APIs are available in both the REST API and GraphQL API, while others are only available in one or the other. The GitHub discussions API is only available in the GraphQL API, so in this case we have no choice: we must use the GraphQL API.

Installing Octokit.GraphQL and creating a connection

The first step is to install the Octokit.GraphQL NuGet package into your app.

Note that there is also an Octokit package which is for interacting with the REST API.

The bad news is this package seems strangely undersupported to me. The current release is still in "beta", despite being first released 5 years ago, and getting an update only every ~6-9 months or so. It seems a bit surprising to me, given this is the "official" way to interact with half of GitHub's API 🤔 If it's stable (and not neglected), fair enough, but I can't believe GitHub hasn't evolved its GraphQL API in that time… That said, I didn't run into any issues, so maybe it's fine.🤷‍♂️

To install the package into your project, use:

dotnet add package Octokit.GraphQL --version 0.2.0-beta

To execute a GraphQL query or a mutation, you need to create a Connection, supplying a user-agent string (this can be anything), and an API token for the account to use to send requests:

var connection = new Connection(
  new ProductHeaderValue("DisqusToGiscusConverter", "1.0.0"), // The user-agent, can be anything
  "ghp_XRH43FvHhFDdefF432"); // An API key, created using the GitHub API

For the purposes of converting my Disqus comments to giscus, I created two connections:

  • One for my main andrewlock GitHub account
  • One for a "bot" account, disqus-bot that I created for the purposes of the migration.

The idea was that I would use my account to post Disqus comments and replies that I had made, and I'd use the bot account to post comments and replies from other people. That was the theory. As it turns out, creating a new account and sending hundreds of comments is a good way to have GitHub block your account 🤦‍♂️.

Either way, you'll need to create an access token for interacting with the API. You can do this from your GitHub profile page by clicking Settings > Developer Settings > Tokens (classic) > Generate New Token (classic). As far as I could tell, you need to make sure the token has repo permissions so you can create discussions:

The create token permissions screen on GitHub

Pass the generated token to your Connection constructor. Remember not to hard-code the token directly in your app like I did in the above snippet: make sure to pass it as a command line argument/environment variable/secret.
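As a rough sketch of the environment variable option, you could read the token with something like the following. The GITHUB_TOKEN variable name and the ReadToken helper are my own placeholders, not something Octokit.GraphQL requires:

```csharp
using System;

// Hypothetical variable name: use whatever matches your shell or CI setup
const string TokenVariable = "GITHUB_TOKEN";

// Returns the trimmed token, or null if the variable is missing or blank
static string? ReadToken(string variableName)
{
    var value = Environment.GetEnvironmentVariable(variableName);
    return string.IsNullOrWhiteSpace(value) ? null : value.Trim();
}

var apiToken = ReadToken(TokenVariable);
Console.WriteLine(apiToken is null
    ? $"Set {TokenVariable} before running the migration"
    : "Token loaded from the environment");
```

If the token is found, you would pass apiToken as the second argument to the Connection constructor instead of a string literal.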

Interacting with GitHub's GraphQL API

One of the benefits of GraphQL is that it allows introspection of the data, which makes it easier to build rich tooling around the API. This commonly manifests as "playground" apps, which give you a GraphQL connection to an API in a browser, and let you explore the available APIs.

GitHub has just such a playground at https://docs.github.com/en/graphql/overview/explorer. You can sign in with your GitHub account and start exploring the API. You'll get "IntelliSense"-like behaviour for each of the available fields (by pressing ctrl+space).

GitHub's GraphQL playground

To interact with the discussions API, we either need to make a query, which returns data (analogous to a REST API's GET), or a mutation which also changes data (analogous to a REST API's other verbs).

The Octokit.GraphQL C# API is designed to model the underlying GraphQL structure very closely. I found this pretty confusing at first, but when you pair it with this playground, it actually makes a lot of sense. For the rest of the post I show various queries you can make for interacting with the discussions API specifically.

Checking the rate limit

An important part of working with the GraphQL API is the rate limit. GitHub measures the complexity of each GraphQL call and assigns it a number of "points". You're limited to 5,000 points per hour. There are details on how to calculate the number of points a GraphQL call will use in GitHub's documentation.

You can also retrieve the rate limit details from the GraphQL API directly (this doesn't count against your rate limit quota). The previous screenshot of the GraphQL playground shows how to do this using the API. In Octokit.GraphQL, the code you use (intentionally) looks very similar to the underlying GraphQL data:

var query = new Query() // Every Octokit.GraphQL call starts with a Query or Mutation
    .RateLimit()
    .Select(x => new  // Choose which values to return using an anonymous object
    {
        x.Limit,   // More fields are also available
        x.Remaining,
        x.ResetAt
    })
    .Compile();

// Execute the query using the provided connection
var results = await connection.Run(query);

Console.WriteLine($"The connection currently has {results.Remaining} of {results.Limit} points remaining. Resets at {results.ResetAt:T}");

When executed, this prints something like:

The connection currently has 4990 of 5000 points remaining. Resets at 10:07:20 PM

The code snippet shown in this example is representative of how all Octokit.GraphQL calls are structured:

  • Start by creating a new Query or Mutation object.
  • Select the API features by calling the fluent interface, mirroring the desired underlying GraphQL API structure.
  • Use LINQ Select() to select the fields to return.
  • Call Compile() to generate the query to execute.
  • Pass the query to Connection.Run() to execute it.

Now we have the basics under our belt, we'll work through all the individual queries I needed to work with the discussions API.

Fetching the repository ID

Most of the discussion APIs require a "repository ID" parameter. This is a unique string ID for the repository, something like "R_kgDOHvS21g". The following query shows how you can provide parameters to the request, passing in the repository name and the repository owner:

var query = new Query()
    .Repository(name: "blog-comments", owner: "andrewlock")
    .Select(x => x.Id)
    .Compile();

ID id = await connection.Run(query);

The ID type is a simple wrapper around the string ID; it's not quite a strongly-typed ID, as the same type is used for all IDs, whether it's a repository ID, a user ID, or a discussion ID.

Fetching discussion categories

When you create a discussion in GitHub, you create it in a category. Giscus recommends that you create comment discussions in the Announcements category, so that only maintainers (and the giscus app) can create them. So that means you need a category ID.

Rather than figure out how to search for a specific category, I opted to just retrieve all the categories and search client side for the Announcements category.

var query = new Query()
    .Repository(name: "blog-comments", owner: "andrewlock")
    .DiscussionCategories(first: 10) // retrieve the first 10 categories
    .Nodes
    .Select(x => new
    {
        x.Id,
        x.Name,
    })
    .Compile();

var result = await connection.Run(query);

ID id = result.Single(x => x.Name == "Announcements").Id;

The slightly incongruous Nodes element in the middle is characteristic of the GitHub API, along with Edges which you'll see later. A node is a generic term for an object, in this case, a discussion category. An edge is a connection between nodes, and contains the important cursor field, which is used for pagination, as you'll see later.

Searching for a discussion

One of the first steps in migrating from Disqus to giscus is checking whether there is an existing discussion for a blog post or if we need to create a new one. As I've described previously, giscus uses the SHA1 hash of the URL path when doing strict matching, so we need to search in the discussion body.
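To make that concrete, here is a minimal sketch of computing the hash and building the search term. The Sha1Hex and BuildSearchTerm helpers and the example path are hypothetical, and I'm assuming the hash is the lowercase hex SHA1 of the UTF-8 path; check how giscus normalises the term for your mapping setting before relying on it:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Lowercase hex SHA1 of the UTF-8 bytes of the input (requires .NET 5 or later)
static string Sha1Hex(string text)
{
    byte[] bytes = SHA1.HashData(Encoding.UTF8.GetBytes(text));
    return Convert.ToHexString(bytes).ToLowerInvariant();
}

// Builds a search term that looks for the hash in discussion bodies of one repository
static string BuildSearchTerm(string hash, string owner, string repo)
    => $"{hash} repo:{owner}/{repo} in:body";

// Hypothetical path: the exact string giscus hashes is an assumption here
string pathHash = Sha1Hex("posts/my-example-post/");
Console.WriteLine(BuildSearchTerm(pathHash, "andrewlock", "blog-comments"));
```

The resulting string has the same shape as the searchTerm passed to Search() in the query.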

The following snippet shows how I searched my repository for a search term, searching in the discussion body. I only expect zero or one discussions to match, so I retrieve a maximum of two, to make sure we haven't accidentally made the query too general.

//                   👇 search term 👇 repo to search             👇 where to search
string searchTerm = "24323a2b3c346e repo:andrewlock/blog-comments in:body";

var query = new Query()
    .Search(searchTerm, SearchType.Discussion, first: 2)
    .Select(search => new
    {
        search.DiscussionCount,
        Discussion = search
            .Nodes
            .Select(node => node.Switch<DiscussionSummary>(
                when => when.Discussion(
                    discussion => new DiscussionSummary(
                        discussion.Title,
                        discussion.Body,
                        discussion.Id,
                        discussion.Number
                )))).ToList()
    }).Compile();

var result = await connection.Run(query);

if (result.DiscussionCount > 1)
{
    throw new Exception($"Expected to find 1 discussion, but found {result.DiscussionCount}");
}

var discussion = result.Discussion.SingleOrDefault();

record DiscussionSummary(string Title, string Body, ID ID, int Number);

As shown above, we use a "generic" search interface, and narrow the search to only discussions using SearchType.Discussion. The GraphQL API here uses a "switch" pattern, as the same API is also used to search for discussions, issues, users, and repositories. In Octokit.GraphQL, that becomes a Switch<T>() method with a corresponding Discussion() or Issue() method (for example), along with a field selection method.

Note that the DiscussionSummary record in this example is a simple DTO. Unlike most of the APIs, you can't use an anonymous type with the Switch<T> method. You have to use a concrete type that you can specify as the generic argument.

Fetching all discussions

My initial approach to searching for a discussion used the search API above, but with hundreds of blog posts to search for, I quickly ran into rate limiting issues. To work around that, I decided to take a different approach—retrieve all the discussions from the API, and search for the specific discussion locally instead.

ID categoryId = ""; // Returned from the previous API call
var results = new Dictionary<string, DiscussionSummary>();
string? cursor = null; // The cursor keeps track of the last discussion returned
var orderBy = new DiscussionOrder // Sorting in ascending creation date to ensure we fetch everything
{
    Direction = OrderDirection.Asc, 
    Field = DiscussionOrderField.CreatedAt
};
while (true)
{
    var query = new Query()
        .Repository(name: "blog-comments", owner: "andrewlock")
        .Discussions(
            first: 100, // Fetch the maximum allowed
            after: cursor, // Continue from where we left off
            categoryId: categoryId, // Only discussions in the "Announcements" category
            orderBy: orderBy) // Sort in ascending created order
        .Edges.Select(e => new
        {
            e.Cursor, // The cursor keeps track of where we're up to
            e.Node.Title,
            e.Node.Body,
            e.Node.Id,
            e.Node.Number,
        })
        .Compile();

    var result = (await connection.Run(query)).ToList();
    if (!result.Any()) // If we didn't get any results, we've finished
    {
        // Reached the end of the list
        return results.Values.ToList();
    }

    cursor = result.Last().Cursor;
    foreach (var discussion in result)
    {
        // Add the returned results to the dictionary
        results.TryAdd(discussion.Id.Value, new DiscussionSummary(
            discussion.Title,
            discussion.Body,
            discussion.Id,
            discussion.Number));
    }
}

This is one of the most complex queries so far:

  • A Dictionary<> stores the complete list of discussions returned from the API.
  • Each request returns up to 100 discussions (the maximum allowed by the API).
  • The cursor is returned with the results and is used for pagination. You pass the returned cursor with the next request, and the API returns the next 100 results from there. That way you can paginate all the discussions.
  • The loop continues until a response comes back with no results.
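Once everything is fetched, the local search can be a simple dictionary lookup. The following is only a sketch: the IndexByHash helper and the sample data are made up, and it assumes each discussion body contains a <!-- sha1: ... --> comment like the ones giscus writes:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Maps each SHA1 hash found in a discussion body to that discussion's number
static Dictionary<string, int> IndexByHash(IEnumerable<(int Number, string Body)> discussions)
{
    var index = new Dictionary<string, int>();
    foreach (var (number, body) in discussions)
    {
        var match = Regex.Match(body, "<!-- sha1: ([0-9a-f]{40}) -->");
        if (match.Success)
        {
            index.TryAdd(match.Groups[1].Value, number); // keep the first discussion for a hash
        }
    }
    return index;
}

// Made-up data standing in for the results of the paginated query
var fetched = new List<(int Number, string Body)>
{
    (1, "# First post\n\n<!-- sha1: a9993e364706816aba3e25717850c26c9cd0d89d -->"),
    (2, "A discussion without a hash comment"),
};

var byHash = IndexByHash(fetched);
Console.WriteLine(byHash.TryGetValue("a9993e364706816aba3e25717850c26c9cd0d89d", out var found)
    ? $"Found discussion #{found}"
    : "No discussion yet, create one");
```

Building the index once means each blog post costs a dictionary lookup rather than another search request against the rate limit.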

Creating a discussion

That's all of the queries covered now, so it's on to the mutations. The first mutation is creating the discussion. Mutations have two parts:

  • An input object that is used during the mutation to create the discussion
  • A list of fields to select from the newly created discussion

async Task<DiscussionSummary> CreateDiscussion(
    Connection connection, string title, string body, ID repoId, ID categoryId)
{
    var mutation = new Mutation()
        .CreateDiscussion(new CreateDiscussionInput
        {
            Title = title,
            RepositoryId = repoId,
            CategoryId = categoryId,
            Body = body,
        })
        .Select(x => new
        {
            x.Discussion.Title,
            x.Discussion.Body,
            x.Discussion.Id,
            x.Discussion.Number,
        });

    var discussion = await connection.Run(mutation);
    if (discussion is not { })
    {
        throw new Exception($"Failed to create discussion for {title}");
    }

    // Map the anonymous result to the DTO used by the earlier queries
    return new DiscussionSummary(
        discussion.Title,
        discussion.Body,
        discussion.Id,
        discussion.Number);
}

I created the discussion body to match the existing giscus discussion bodies using something like this:

public string GetBody(BlogPost post)
{
    return $"""
            # {post.GitHubDiscussionTitle}

            {post.MatchingPost?.Excerpt}

            {post.Url}

            <!-- sha1: {post.Sha1()} -->
            """;
}

Create discussion comment

Once the discussion is created, the last thing left is to add the comments. This uses a similar pattern to the discussion creation:

// Returns the ID of the newly created comment
async Task<ID> CreateDiscussionComment(
    Connection connection, ID discussionId, ID replyToCommentId, string commentBody)
{
    var mutation = new Mutation()
        .AddDiscussionComment(new AddDiscussionCommentInput
        {
            Body = commentBody,
            DiscussionId = discussionId,
            ReplyToId = replyToCommentId,
        })
        .Select(x => new
        {
            x.Comment.Id,
            x.Comment.Url
        });

    var newComment = await connection.Run(mutation);
    if (newComment is not { })
    {
        throw new Exception($"Failed to create comment for {discussionId}");
    }

    return newComment.Id;
}

And with that, we're all done! Theoretically, it's just a case of putting all the pieces together to convert your comments from Disqus to giscus. Unfortunately, I still haven't managed to finish that process, as I fell foul of GitHub's hidden rate limits.

Rate limit issues

At the start of this post I mentioned that the GitHub GraphQL API has a rate limit, and showed how to check how much of your rate limit you have left. Unfortunately, there are also some hidden rate limits I ran into.

If you send too many requests in quick succession, even if you stay within the "official" rate limit, you'll get errors saying "GraphQL error: was submitted too quickly". This issue is mentioned in a GitHub issue, and the simplest workaround is to wait 2-3 seconds between creating discussions/comments.
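A minimal sketch of that workaround might look like the following. RunThrottledAsync is a hypothetical helper, and the lambdas are stand-ins for the real calls that create discussions and comments:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Runs each operation in order, pausing between consecutive operations
static async Task RunThrottledAsync(
    IEnumerable<Func<Task>> operations, TimeSpan delayBetween)
{
    var isFirst = true;
    foreach (var operation in operations)
    {
        if (!isFirst)
        {
            await Task.Delay(delayBetween); // pause before every call except the first
        }
        isFirst = false;
        await operation();
    }
}

// Stand-ins for the real create discussion/comment calls
var completed = new List<string>();
var work = new List<Func<Task>>
{
    () => { completed.Add("discussion"); return Task.CompletedTask; },
    () => { completed.Add("comment"); return Task.CompletedTask; },
};

// Use 2-3 seconds for real requests; shortened here so the demo finishes quickly
await RunThrottledAsync(work, TimeSpan.FromMilliseconds(50));
Console.WriteLine(string.Join(", ", completed));
```

Running the operations sequentially, rather than in parallel, is what keeps the gap between requests predictable.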

And that seemed to be doing the job. Right until my giscus-bot account got locked 😩 Some combination of being a new account, and submitting a lot of requests meant that the account was locked and could no longer send requests. Worse, all of the comments already posted were no longer visible.

Which leaves me with a problem…

I could try posting all the comments again with my own account. But I really don't want to get that one locked too 😬 Instead, I started the appeal and reinstatement process… but it's been 13 days since then, and I've not heard anything, so I'm stuck in limbo right now.

The appeal and reinstatement

So we'll see how it goes I guess🤷‍♂️

Summary

In this post I described how to use the Octokit.GraphQL NuGet package to interact with the GitHub GraphQL discussions API. I showed how to search for discussions, list categories, and how to create discussions and comments. I also discussed some of the rate limit issues I ran into that you need to be aware of when making lots of requests!

Andrew Lock | .Net Escapades