Ephemera

how to automatically delete your old tweets and toots using your own computer

26 April 2020

There has always been a contradiction in social media technologies between the affordances of platforms in daily use, and the archival capacities available to the curious or belligerent. Users are encouraged and expected to joke, argue, and share every thought and action, in ways that are often intimate and unguarded. And yet, with the cold logic of responsible system administration and a well-executed disaster recovery plan, every one of these utterances is recorded in an ever-expanding archive of the potentially-embarrassing, regretfully offensive, and highly contextual, mixing in a soup of tedium and TV spoilers. Inevitably, this public archive has been weaponised. The 2018 Victorian state election saw a rolling series of candidate resignations and dis-endorsements from nearly every party, as journalists and political operatives systematically scoured opponents' social media histories going back years, looking for anything potentially compromising. Some of what was found was legitimately revealing. Much of it was maliciously removed from its context, embarrassing but commonplace, or targeted at people who were not even running as candidates. Anyone and everything, it seemed, was considered fair game.

I'd had a sense of deep unease about this public archive of utterances well before the events of 2018, and have no intention of ever running for public office, but it proved to be a catalyst for me to finally take action to try to make my ephemeral social media posts actually ephemeral (at least from a public archiving point of view - I have no illusions that Facebook ever really deleted all my data when I deleted my account, or that Twitter really deletes all my old tweets). Since around December 2018 I've been regularly deleting old tweets from Twitter and toots from Mastodon using some Python scripts. I've been asked a few times how this works, and recently made some updates to the Mastodon script, so if you would like to do the same thing, here's a brief guide.

Deleting tweets programmatically

API keys

Prior to Twitter's July 2018 API changes it was pretty easy to just create a new 'app' in your Twitter account to obtain the API keys and tokens needed to interact with your account programmatically. Unfortunately, in the name of "fighting abuse" (something Twitter and Jack Dorsey have repeatedly shown themselves to be both deeply uninterested in, and comically incompetent at), Twitter chose not to ban Nazis but to continue its slow-rolling war on the independent developers who are largely responsible for its success.

Why am I giving you this history lesson? Mostly because I'm still annoyed, but also as background for why there is now a convoluted and opaque system where you have to register your app, fill in a form, wait an unspecified period of time for some tech-bro in San Francisco to 'review' your app, and then maybe get your API keys activated, before you will be able to delete your old tweets using the options below.

To delete old tweets using a script you need to obtain four special strings:

  • consumer_key
  • consumer_secret
  • access_token
  • access_token_secret

These become available by registering for a developer account and app at developer.twitter.com.
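
To give a sense of where these four strings end up, here is a rough sketch using the tweepy library (which we install further down). It is an illustration only, not the script used in this guide: the function names make_api and delete_ids are hypothetical, and I'm assuming tweepy's OAuthHandler, API and destroy_status calls.

```python
def make_api(consumer_key, consumer_secret, access_token, access_token_secret):
    """Build an authenticated tweepy client from the four strings."""
    import tweepy  # imported here so the rest of the sketch loads without it

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    return tweepy.API(auth)


def delete_ids(api, tweet_ids):
    """Try to delete each tweet ID; return the IDs that could not be deleted."""
    failed = []
    for tweet_id in tweet_ids:
        try:
            api.destroy_status(tweet_id)
        except Exception:
            # Keep going: one failed tweet should not stop the whole run
            failed.append(tweet_id)
    return failed
```

Passing the client into delete_ids, rather than creating it inside, makes it easy to try the loop out with a stand-in object before pointing it at your real account.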

Deep cleaning

Before we can move to regularly deleting tweets of a certain age, we need to do the Twitter equivalent of a 'deep clean'. The Twitter Standard API will only go back around 6 months into the past (the exact time seems a little rubbery). To remove tweets older than this, we need to target the exact tweet IDs rather than simply search your tweet history. For this we need a tweet-deleting script and a copy of your Twitter archive, which you have to request from Twitter.

Depending on how big your archive is, the direction of the wind, and the weather in California, this can take anywhere between an hour and a week. Best to request it at the same time you request your developer account.

Once you have your archive and your API credentials, you can get down to business.

  1. Create a new folder on your PC called tweet_delete
  2. Copy the script into a new file in that folder, called delete_tweets.py.
  3. Open the .zip archive from Twitter. There should be a folder at data/js/tweets. Copy this folder into your tweet_delete folder.
  4. Open delete_tweets.py in a text editor (Notepad or TextEdit will do if you don't have a code editor)
  5. Enter your Twitter credentials and the period of time you want to keep tweets for, starting at line 21:
consumer_key = 'YOUR CONSUMER KEY HERE' 
consumer_secret = 'YOUR CONSUMER SECRET HERE'
access_token = 'YOUR ACCESS TOKEN HERE'
access_token_secret = 'YOUR TOKEN SECRET HERE'
days_to_keep = 365 # change this number to the number of days you want to keep tweets for - e.g. 28
  6. If you don't already have Python 3 installed, you will need to do that now.
  7. Open up a command line (cmd / Terminal)
  8. Install tweepy - pip install tweepy should do the trick, or you may need pip3 install tweepy.
  9. Line 37 of the script is going to be a problem, because instead of a single file called tweets.js you will have a file for each month of each year. We could loop over each of these, but that will make the whole process take a long time (I had 10 years of tweets to delete, for example). Instead, we'll delete one month at a time, and change the file name for each run:
file = "tweets/2019_03.js"
fp = open(file,"r")
  10. Now you can run the script with python delete_tweets.py
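
If you're curious what happens to each monthly file, the core idea can be sketched like this. It is a simplified illustration rather than the actual script: I'm assuming each file is a JavaScript assignment followed by a JSON array of tweets with id_str and created_at fields, and the function names are made up for the example.

```python
import json
from datetime import datetime, timedelta, timezone


def load_archive_tweets(js_text):
    """Parse one archive .js file by skipping the JavaScript assignment
    that sits in front of the JSON array (assumed file layout)."""
    return json.loads(js_text[js_text.index("["):])


def ids_older_than(tweets, days_to_keep, now=None):
    """Return the id_str of every tweet created before the cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days_to_keep)
    old = []
    for tweet in tweets:
        # Assumed date format; older archives may use a different one
        created = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
        if created < cutoff:
            old.append(tweet["id_str"])
    return old
```

The list of IDs that comes back is what gets handed to the Twitter API for deletion, one tweet at a time.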

You may find that the script fails on some tweets. At some point Twitter appears to have changed the way they stored dates. So for older tweets you may need to change line 41 (or actually line 42, after you make the change above) from this:

d = datetime.strptime(tweet['created_at'], "%a %b %d %H:%M:%S %z %Y")

To this:

d = datetime.strptime(tweet['created_at'], "%Y-%m-%d %H:%M:%S %z")
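
Rather than editing that line back and forth between runs, another option is to try both formats in turn. This is only a sketch and not part of the original script; parse_created_at is a name invented for the example.

```python
from datetime import datetime

DATE_FORMATS = (
    "%a %b %d %H:%M:%S %z %Y",  # e.g. Wed Mar 06 10:15:00 +0000 2019
    "%Y-%m-%d %H:%M:%S %z",     # e.g. 2012-03-06 10:15:00 +0000
)


def parse_created_at(value):
    """Return a datetime using the first format that matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next format
    raise ValueError("Unrecognised date format: " + value)
```

With this in place, the line in the script would become d = parse_created_at(tweet['created_at']).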

As you work your way through your Twitter archive, you'll begin to notice your 'total tweets' number drop - but it probably won't end up matching the number you expect. Because you're deleting tweets using a script, Twitter's way of counting your tweets will take a while to catch up - and may never be correct. Don't worry too much about it.

If you want to keep a record of any errors while you run the script, you can send error messages out to a log file by adding this line at line 39:

errors = open("errors.txt", "w")

And adding this to the except section at the end of the script:

except:
    print("ERROR for - " + tweet['created_at'] + " " + tweet['id_str'])
    # write one ID per line so the log stays readable
    errors.write(tweet['id_str'] + "\n")

This will create an error logging file called errors.txt listing the ID numbers of any tweets that throw errors when you try to delete them.

Housekeeping

Now that you've "deep cleaned" your Twitter account, you just need to keep doing regular housekeeping. For this job we can use Magnus Nissel's cleantweets script.

You can download the whole repo using git, or just download it as a zip file using the big green button and unzip it into a new directory next to your tweet_delete folder. Magnus's README gives a fairly clear overview of how to set it up. You will need the same API keys you used for the previous script, but this time you put them in a file called settings.ini. You already installed tweepy and Python, so you have everything you need. However there's a problem - you need to run this script regularly for it to be of any use. Ideally you'd run it every day. So how do we do that?

There are basically three options:

  1. On Windows you could use Task Scheduler
  2. On Linux/BSD/MacOS you can use cron
  3. On MacOS you can use launchd to get around a limitation with cron

Windows

I only use Windows for work, and don't really do any coding in my job, so I'm not hugely familiar with how to use Task Scheduler (or even Python, to be honest) on Windows. This outline looks pretty useful though, so if you want to use cleantweets on Windows, give it a go.

*nix or MacOS with cron

This is the option suggested in Magnus's README. If you download everything in the repo then you can take advantage of his autorun.sh file, which is just a cleaner way to call the script:

  1. Check for your user cron file with crontab -l
  2. Edit the file: crontab -e
  3. Paste this into a new line: @daily cd ~/cleantweets && ./autorun.sh
  4. Close crontab either with :wq (Vi/Vim) or Ctrl + X followed by y (nano)

Your script will now run 'daily' using cron. If you are using a server, this should be fine. If you're using your personal computer, you have a potential problem: if your machine is not awake, logged in, and connected to the internet at the time your job runs, it will simply fail.

MacOS with launchd

To overcome some of the issues with cron, MacOS allows you to use launchd to schedule scripts. Unlike cron, if you schedule a job to run with launchd and your machine is sleeping at the scheduled time, the job will run when the machine wakes up. This means your script should run every day you use your Mac - but we need to set it up the right way.

The biggest problem I've found with running scripts with launchd is that they will run as soon as the machine is woken up - which is usually a few seconds prior to the network becoming available. With no internet, your tweet deleting script won't be much use. So we need to make some adjustments. Basically, we need to make two new files: a plist file to tell launchd what to do, and a short Python script to retry the main script if there are any network errors. Our Python script won't be the most elegant Python ever written, but it will do the job:

#!/usr/bin/env python3
from datetime import datetime
import subprocess
import sys
import time

print('Running at ' + str(datetime.now()))

def deleteTweets(retry_count=0):
    try:
        # Each argument must be its own list item. sys.executable is the
        # Python that is running this file, and check=True raises an
        # exception if cleantweets.py exits with an error.
        subprocess.run(
            [sys.executable, "cleantweets.py", "--delete", "--config", "settings.ini"],
            check=True
        )
    except Exception:
        # probably there is a network error
        if retry_count < 4:
            print('Waiting 1 minute before trying again')
            time.sleep(60)
            retry_count += 1
            print('Attempt ' + str(retry_count + 1))
            deleteTweets(retry_count)
        else:
            print('Gave up trying after 5 attempts')

deleteTweets()

Save this into a new file called schedule_cleantweets.py, in the same directory as the other cleantweets files. This is the file that will be called by your plist file, which is what we will now create. The plist file tells launchd what to do and when to do it. It looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
	<dict>
		<key>Label</key>
		<string>cleantweets.scheduler</string>
		<key>WorkingDirectory</key>
		<string>YOUR_FILEPATH_HERE</string>
		<key>ProgramArguments</key>
		<array>
			<string>PATH_TO_PYTHON_EXECUTABLE</string>
			<string>YOUR_FILEPATH_HERE/schedule_cleantweets.py</string>
		</array>
		<key>StandardOutPath</key>
		<string>cleantweets.scheduler.log</string>
		<key>StandardErrorPath</key>
		<string>cleantweets.scheduler.error.log</string>
		<key>StartCalendarInterval</key>
		<dict>
			<key>Hour</key>
			<integer>9</integer>
			<key>Minute</key>
			<integer>00</integer>
		</dict>
	</dict>
</plist>

You will need to adjust a few lines:

Replace YOUR_FILEPATH_HERE with the full path to the directory with your files. e.g. /Users/hugh/cleantweets.

Replace PATH_TO_PYTHON_EXECUTABLE with (surprise!) the path to your Python executable. Where this is may depend on how you installed Python. The first thing to do is work out whether python calls Python 2 or Python 3. Unless your Mac is quite new, it probably calls the old version. You can check this with:

python --version

You will probably get something like Python 2.7.16 in return. If the version returned is Python 3.x, then you need to use python when calling scripts. Otherwise you need to use python3. If you're still not sure, python3 is probably correct. Now we need to find where the executable is. Run this command:

which python3

You should get something like:

/usr/local/bin/python3

Replace PATH_TO_PYTHON_EXECUTABLE in your plist file with this string.
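
If you'd rather ask Python itself, this short sketch prints the python3 found on your PATH, falling back to whichever interpreter is running the sketch. The function name is just for illustration.

```python
import shutil
import sys


def python_executable_path():
    """Prefer python3 as found on PATH; fall back to the running interpreter."""
    return shutil.which("python3") or sys.executable


print(python_executable_path())
```

Run it with python3 from the same Terminal you plan to use, and paste the path it prints into the plist file.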

Lastly, you may want to change the time the script is scheduled to run (bearing in mind that it will run when you open your laptop if it was asleep at the scheduled time). Change these lines if you want. launchd uses 24 hour time, so 1.30pm would be hour 13 and minute 30:

<key>Hour</key>
<integer>9</integer>
<key>Minute</key>
<integer>00</integer>
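
If editing XML by hand isn't your idea of fun, the same file can be generated with plistlib from Python's standard library. This is a minimal sketch under a few assumptions: build_plist is a hypothetical helper, and the directory and Python path in the example are placeholders to swap for your own.

```python
import plistlib


def build_plist(working_dir, python_path, hour=9, minute=0):
    """Return the launchd job definition as plist XML bytes."""
    job = {
        "Label": "cleantweets.scheduler",
        "WorkingDirectory": working_dir,
        "ProgramArguments": [python_path, working_dir + "/schedule_cleantweets.py"],
        "StandardOutPath": "cleantweets.scheduler.log",
        "StandardErrorPath": "cleantweets.scheduler.error.log",
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }
    return plistlib.dumps(job)


# Example values only: substitute your own directory and Python path
xml = build_plist("/Users/hugh/cleantweets", "/usr/local/bin/python3", hour=13, minute=30)
print(xml.decode("utf-8"))
```

The keys come out in alphabetical order rather than the order shown above, which makes no difference to the job.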

When you're ready:

  1. Save this file as cleantweets.scheduler.plist
  2. Copy the file to ~/Library/LaunchAgents/, where launchd expects it:
cp cleantweets.scheduler.plist  ~/Library/LaunchAgents/
  3. Load it by running:
launchctl load ~/Library/LaunchAgents/cleantweets.scheduler.plist

You can check whether it has been running properly by looking at the cleantweets.scheduler.log and cleantweets.scheduler.error.log files that will be created the first time it runs.

Deleting toots programmatically

If you're using Mastodon and want to regularly delete your toots, the good news is that it is a much simpler process than deleting old tweets. The primary reason for this is that Mastodon is open source software, so even though there are dozens or even hundreds of different servers in the Mastodon universe, it's quite transparent how the deletion API works and there are no vague time limits, so we don't need to do a separate deep clean like with Twitter.

In late 2018 I used Francois's tweet deleting script as a basis for a toot-deleting script I called ephemetoot. A couple of people have contributed code, and it now includes an automated option for creating and loading a plist file so you don't have to undergo all the faffing about we did for Twitter.

API key (access token)

To get your access token, just go to Settings - development - New Application and create an app name, giving the app full Read and Write scopes. Save, click on the URL for your new app, and copy the access token string.
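
If you want to see how the token is used before handing it to a script, here is a small sketch that builds (but does not send) a request to Mastodon's verify_credentials endpoint with the token as a Bearer header. The function name, the example.social address and the token value are all placeholders, not anything ephemetoot itself uses.

```python
import urllib.request


def build_verify_request(instance_url, access_token):
    """Build (but do not send) a request that checks the access token."""
    url = instance_url.rstrip("/") + "/api/v1/accounts/verify_credentials"
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + access_token}
    )


# Placeholder values: use your own server address and token
request = build_verify_request("https://example.social", "YOUR ACCESS TOKEN HERE")
# To actually send it: urllib.request.urlopen(request)
```

A token that works should get your account details back as JSON when the request is sent; a bad one should get an authorisation error.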

Housekeeping

To delete toots regularly we can use ephemetoot.

  1. Get the code either with git or by downloading the zip file
  2. Install with pip by running the following inside your ephemetoot directory:
pip3 install .
  3. Set up your configuration file - this is where you enter your access token

To run the script you have the same options as with Twitter: set up your own scheduling script on Windows, use cron on Linux, or use launchd in MacOS.

cron for Mastodon

Per the instructions in the ephemetoot README, you can set up cron relatively simply just like with cleantweets:

  1. crontab -e
  2. enter a new line:
@daily ephemetoot --config /path/to/ephemetoot/config.yaml
  3. exit with :wq (Vi/Vim) or Ctrl + X followed by y (nano)

launchd for Mastodon

With ephemetoot, setting up launchd scheduling is much simpler than for Twitter, because I built it into the script. You can just use flags:

ephemetoot --schedule [directory] --time hour minute

So from within the ephemetoot directory you could schedule it to run at 7:30pm every day by running:

ephemetoot --schedule --time 19 30

As with our previous example, you will see a log file and an error.log file after it first runs, so you can keep an eye on any issues.

So there you go. To make your tweets and toots actually ephemeral you don't need to sign up to an external service, nor run your own server. With a few Python scripts and some scheduling tools you can take care of it from your own computer!