TL;DR: The tool is Instagram Lists and it empowered me to clean up the accounts I follow. Source code is here. The tool is a robust, guided hack made to be user friendly.
I started with a problem: I was following too many accounts on Instagram. And when you consider that I have been using the platform since April 2011, it becomes clearer that a good spring cleaning is much needed. Also, psychologically I didn’t like having a significantly higher number of accounts I am following than followers.
Finding solutions & why unofficial
So I started with an internet search for how to get a list of the accounts I follow, my “followings”. In my case I wanted this list narrowed down to the accounts that don’t follow me back; in other words, accounts that are not my “friends”. I also wanted to find out which accounts have been inactive for a long time.
I came across this StackOverflow question from 2015, and the accepted answer at the time pointed to Instagram’s APIs. However, since the 2018 Facebook–Cambridge Analytica data scandal, their official APIs became very restrictive. Facebook, the company, owns Instagram if you were not already aware. The data scandal involved Cambridge Analytica obtaining info about the friends of the users of an app that Cambridge Analytica made, even if these friends never used the app.
As of writing this, Instagram’s official APIs can only provide info on the current user’s media; nothing on their followers and followings so it would be impossible to build a tool that solves my problem using these APIs.
Luckily, there are some more recent answers to the StackOverflow question that show the use of Instagram’s unofficial APIs that are used by their own website. These APIs must be called with the right authorization so must be executed from within the instagram.com browser tab after the user has logged in. So effectively, these APIs provide a way to automate what I would otherwise do manually such as scrolling the entire list of accounts I follow.
I should note that in 2020, Instagram introduced a “Least Interacted With” list under the Following list, described as:
Review accounts you’ve interacted with the least in the last 90 days, such as liking their posts or reacting to their stories.
I found the list partly okay but overall not a great solution to my problem. First and foremost, this list of 50 accounts is not exhaustive. You would have to go through it once, unfollow as you wish, and then re-enter the list to see the new suggestions at the bottom (which keep the list at 50 accounts). Secondly, there were several accounts that I didn’t think belonged on the list.
Building
Armed with a will and the StackOverflow answers, I set out to build a client-side JavaScript tool to empower me to solve my problem. My goal for this tool: sort the accounts I follow by when they last posted. The tool is ready when I have solved my problem thanks to it.
Proof of concept
I started with a manual proof of concept that had to accomplish all of these actions:
- Get lists of all my followers and followings.
  https://www.instagram.com/graphql/query/?query_hash=...
- Get relevant details about some of these accounts (in a loop).
  https://www.instagram.com/${username}/?__a=1&__d=dis
The first part was fine; after all, it was the aim of the StackOverflow answers.
The loop in the second part started out fine, and then I ran into the HTTP error 429 Too Many Requests.
Consequently, opening the Instagram app (or the website) showed that I needed to confirm my phone number before doing anything else.
This is where I had to start thinking about API rate limits, and understanding that each time I hit the rate limit, I had to confirm my phone number and wait (~30 minutes) before retrying. I was logged in to my only Instagram account, so I risked getting it suspended. I hit rate limits about 5 times, and I don’t know at what point Instagram suspends accounts, or whether they do at all.
From my testing in November 2020, the API for an account’s (visible) details (/${username}/?__a=1&__d=dis) had a limit of 25 requests per 15 minutes, so I added a 36-second delay inside the for-loop (15×60/25 = 36).
One small but important caveat is that the logged-in user should not browse instagram.com at the same time; otherwise the rate limit could be reached sooner.
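The delay arithmetic above can be sketched as a small throttled loop. This is a minimal illustration rather than the tool's actual code: `delayBetweenRequestsMs`, `sleep`, `fetchAllDetails` and the `fetchDetails` callback are hypothetical names, and the limit values are the ones I observed in my own testing.

```javascript
// Assumption: limit observed in my November 2020 testing, not an official figure.
const LIMIT_REQUESTS = 25;
const LIMIT_WINDOW_SECONDS = 15 * 60;

// Minimum spacing between requests so the window's quota is never exceeded.
function delayBetweenRequestsMs(requests, windowSeconds) {
  return (windowSeconds / requests) * 1000;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// fetchDetails is a hypothetical callback that fetches one account's details.
async function fetchAllDetails(
  usernames,
  fetchDetails,
  delayMs = delayBetweenRequestsMs(LIMIT_REQUESTS, LIMIT_WINDOW_SECONDS)
) {
  const results = [];
  for (let i = 0; i < usernames.length; i++) {
    results.push(await fetchDetails(usernames[i]));
    // No need to wait after the last request.
    if (i < usernames.length - 1) await sleep(delayMs);
  }
  return results;
}
```

With the default values, `delayBetweenRequestsMs(25, 15 * 60)` gives 36000 ms, which is the 36-second delay mentioned above.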
The tool
So now the question was: how to make this hack as user-friendly as possible? Because I wanted the tool to be 100% web client-side and without needing any installs, automation software like Selenium was out of scope. So I envisioned a web page with the following steps:
- Type your username and open the starting API link (/${username}/?__a=1&__d=dis)
- Copy JavaScript code (with all functions) into the instagram.com console
- Run function to get all followers and followings
- Paste data into my web page
- Choose which subset to view, for example, followings that don’t follow me back, or all
- For more details per account, select/unselect accounts (and see a time estimate)
- Run function to get more details for the selected accounts
- Paste data into my web page to view a table
You can see the result here. I explain at the start that we need a modern desktop browser and how Safari users can enable developer tools. Each block of code to be copied has a “Copy” button.
There are 2 tables. The first is after getting basic details about all followings and followers. The second is after getting more details about a selection of these. Since the second part with more details is time-consuming, it’s labelled as optional. The first column of the first table has checkboxes that allow selecting accounts for which to get more details on. Below the first table, there is an estimate of the time needed to get more details for the selected accounts.
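To make the subset and time-estimate ideas concrete, here is a minimal sketch. It assumes each account is an object with a `username` field, and that details are fetched one account at a time with a fixed spacing (36 seconds in my testing). The function names and the data shape are illustrative only and may not match the source code.

```javascript
// Hypothetical data shape: each account is { username: string }.
// Returns the followings whose username does not appear among the followers.
function notFollowingBack(followings, followers) {
  const followerNames = new Set(followers.map((account) => account.username));
  return followings.filter((account) => !followerNames.has(account.username));
}

// Rough upper bound, assuming one request per selected account and a fixed
// spacing between requests (36 seconds is the value from my own testing).
function estimateSeconds(selectedCount, secondsPerRequest = 36) {
  return selectedCount * secondsPerRequest;
}

// Formats a number of seconds as whole minutes and seconds, e.g. "6 min 0 s".
function formatDuration(totalSeconds) {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes} min ${seconds} s`;
}
```

For example, selecting 10 accounts gives `estimateSeconds(10)` = 360 seconds, which `formatDuration` renders as "6 min 0 s".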
Finally, I end with a “Clear this page” button because certain fields are stored in the browser tab’s sessionStorage. This is done so the tables persist per tab after refreshing. Behind the scenes, I developed ScopedStorage, a simple wrapper around sessionStorage and localStorage to mimic storage per web page.
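The idea of storage scoped per web page can be sketched as a thin wrapper that prefixes every key. This is not the real ScopedStorage implementation, only an assumption about how such a wrapper could look; the class name `ScopedStorageSketch` and the `::` separator are made up for illustration.

```javascript
// Hypothetical sketch; the real ScopedStorage may work differently.
class ScopedStorageSketch {
  // storage: any object implementing the Web Storage API
  //          (for example sessionStorage or localStorage).
  // scope:   a string identifying the page, for example location.pathname.
  constructor(storage, scope) {
    this.storage = storage;
    this.prefix = `${scope}::`;
  }

  getItem(key) {
    return this.storage.getItem(this.prefix + key);
  }

  setItem(key, value) {
    this.storage.setItem(this.prefix + key, String(value));
  }

  removeItem(key) {
    this.storage.removeItem(this.prefix + key);
  }

  // Removes only the keys that belong to this scope.
  clear() {
    const ownKeys = [];
    for (let i = 0; i < this.storage.length; i++) {
      const key = this.storage.key(i);
      if (key !== null && key.startsWith(this.prefix)) ownKeys.push(key);
    }
    ownKeys.forEach((key) => this.storage.removeItem(key));
  }
}
```

In a browser, something like `new ScopedStorageSketch(sessionStorage, location.pathname)` would keep one page's fields separate from another page's on the same origin, and a "Clear this page" button could call `clear()` without touching the other page's data.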
Robustness
After building a Minimum Viable Product, the next step was to refine it and make it more robust. Getting feedback from friends was also vital in making sure it works well and is easy to use. I made the instructions on my web page as clear as possible, and if the user gets a step wrong, the code should hint at the problem where possible. For example, the try-catch below was added with a hint in case something goes wrong.
async function getLists() {
  // ...
  try {
    baseInfo = JSON.parse(document.getElementsByTagName('body')[0].innerText);
  } catch (error) {
    console.error('You may not be on the right page, normally it should be like "https://www.instagram.com/username/?__a=1&__d=dis"', error);
    return;
  }
  // ...
}
Maximum list sizes
One thing I wondered about was the maximum list sizes of followings and followers that can be retrieved.
So I pretended to be Cristiano Ronaldo with 274 million followers!
I then got another 429 Too Many Requests error.
The lists are paginated with up to 50 accounts per page (not always 50 in reality), and I managed 200 pages, so that’s up to 10,000 accounts across followings and followers combined (including duplicates).
From my testing, I discovered that this page limit applies per 15 minutes, which works out to 4.5 seconds between calls for the next page (15×60/200 = 4.5). I took into account the API’s average latency on my laptop of ~400 ms, so I set the delay in the for-loop to 4.1 seconds. You can check HTTP request durations in the Network tab of the developer tools. With a little more testing I found that I could reach 370 pages with this delay, so that’s a combined list size of up to 18,500 accounts, still very far from Cristiano’s millions.
With this info on page limits, I wrote the code to either fetch 200 pages in total without waiting, or 370 pages with the aforementioned delay. However, only the 200-page method is currently activated.
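The two pagination modes can be sketched as one loop with an optional delay. This is an illustration under assumptions, not the tool's code: `fetchPage` is a hypothetical callback that resolves to `{ accounts, nextCursor }`, and `pageDelayMs`, `wait` and `fetchAllPages` are names invented for this example.

```javascript
const PAGE_WINDOW_SECONDS = 15 * 60;

// Delay per page so that maxPagesPerWindow calls are spread over the window,
// minus the measured request latency (never negative).
function pageDelayMs(maxPagesPerWindow, latencyMs) {
  const spacingMs = (PAGE_WINDOW_SECONDS / maxPagesPerWindow) * 1000;
  return Math.max(0, spacingMs - latencyMs);
}

const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// fetchPage(cursor) is a hypothetical callback resolving to
// { accounts: Array, nextCursor: string | null }.
async function fetchAllPages(fetchPage, { maxPages = 200, delayMs = 0 } = {}) {
  const accounts = [];
  let cursor = null;
  for (let page = 1; page <= maxPages; page++) {
    const result = await fetchPage(cursor);
    accounts.push(...result.accounts);
    cursor = result.nextCursor;
    if (cursor === null) break; // no more pages
    if (delayMs > 0 && page < maxPages) await wait(delayMs);
  }
  return accounts;
}
```

Here `pageDelayMs(200, 400)` gives 4100 ms, the 4.1-second delay described above. Calling `fetchAllPages(fetchPage)` corresponds to the 200-page mode without waiting, while `fetchAllPages(fetchPage, { maxPages: 370, delayMs: pageDelayMs(200, 400) })` corresponds to the slower mode.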
Handling HTTP errors
With all my testing, I came to know the 429 Too Many Requests error all too well. So while other HTTP errors are caught and the loop continues to the next list item, this one is treated differently.
When fetching lists, the function is aborted, whereas when fetching more details per account, there is a 30-minute pause and then the failed account is retried.
I also added an abort() function that can be run to halt any for-loop and get the data fetched so far.
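The per-account behaviour described above could look roughly like the sketch below. It is a simplified illustration, not the source code: `fetchOne` is a hypothetical callback resolving to `{ status, data }`, and `abortRequested`, `pause` and `fetchDetailsWithRetry` are names made up for this example.

```javascript
// Hypothetical flag flipped by abort(); checked at the top of every iteration.
let abortRequested = false;
function abort() {
  abortRequested = true;
}

const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// fetchOne(item) is a hypothetical callback resolving to { status, data }.
// Note: this sketch retries a rate-limited item indefinitely until abort() is called.
async function fetchDetailsWithRetry(items, fetchOne, pauseMs = 30 * 60 * 1000) {
  const results = [];
  for (let i = 0; i < items.length; i++) {
    if (abortRequested) break; // return whatever was fetched so far
    const { status, data } = await fetchOne(items[i]);
    if (status === 429) {
      // Rate limited: wait, then retry the same item.
      await pause(pauseMs);
      i--;
      continue;
    }
    if (status < 200 || status >= 300) {
      // Any other HTTP error: log it and move on to the next item.
      console.warn(`Skipping ${items[i]} (HTTP ${status})`);
      continue;
    }
    results.push(data);
  }
  return results;
}
```

Running `abort()` in the console while the loop is waiting makes it stop at the next iteration and return the partial results.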
Another possibility to consider is that the HTTP status is ok but instead of receiving JSON, you get HTML with a visible message. In this case, the code shows a hint:
async function getLists() {
  // ...
  try {
    response = await response.json();
  } catch (error) {
    console.error(`Detected that you may need to verify your account. Stopping. Failed at page number ${pageCount.toLocaleString()} (during ${config.name} list).`, error);
    doAbort = true;
    break;
  }
  // ...
}
Conclusion & user feedback
Did I solve my problem? Definitely!
In building this tool, I also empowered others to manage the accounts they follow:
That is a very cool tool! The output is easy to interpret and the instructions work pretty well for allowing me to pick what I want to view in the output
I hope this article helps you understand the source code and maybe even empowers you to build other tools. I am always open to feedback.
Edit (5 Jan 2023): Replaced /${username}/?__a=1 with /${username}/?__a=1&__d=dis, following this StackOverflow answer.