New archive, new format. #11

Open
Zefling opened this issue Nov 4, 2018 · 7 comments

@Zefling

Zefling commented Nov 4, 2018

The new archive contains all media (except YouTube videos) and the avatar, but no interface.
This project doesn't work with the new format.

I made a project that reads the archive and recreates the interface: https://git.ikilote.net/angular/twitter-archive
I will look for a way to download avatars and create an avatar.js.

@kfogel

kfogel commented Aug 16, 2019

Hey, @mwichary. I also found (as of today) that Twitter is using a new archive format. For one thing, there's no index.html anywhere in the download. The new downloads include a README.txt that says:

This archive consists of machine-readable JSON files containing
information associated with your account. We’ve included the
information we believe is most relevant and useful to you, including
your profile information, your Tweets, your DMs, your Moments, your
media (images, videos and GIFs you’ve attached to Tweets, DMs, or
Moments), a list of your followers, a list of accounts following
you, your address book, Lists that you’ve created, are a member of,
or are subscribed to, interest and demographic information that we
have inferred about you, information about ads that you’ve seen or
engaged with on Twitter, and more.

So maybe some stuff that used to not be included now is included?

Also, I'm not sure this is related to the new format, but I had to do this patch in order to get the script to even get to the point of looking for index.html. Not only does the native archive not have an img/avatars/ directory, it doesn't even have an img/ directory! Maybe it used to? Anyway, this change is probably a good idea either way:

--- twitter-export-image-fill.py
+++ twitter-export-image-fill.py
@@ -111,7 +111,7 @@ def load_tweet_index():
 
 def make_directory_if_needed(directory_path):
   if not os.path.isdir(directory_path):
-    os.mkdir(directory_path)
+    os.makedirs(directory_path)
 
 
 def is_retweet(tweet):
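
To illustrate why the patch above helps: `os.mkdir` creates only the last path component and fails when a parent such as img/ is missing, while `os.makedirs` creates the intermediate directories too. The sketch below is not the script's code; it assumes Python 3.2 or later (for `exist_ok`), and the temporary directory layout is only an example.

```python
import os
import tempfile


def make_directory_if_needed(directory_path):
    # exist_ok=True (Python 3.2+) makes the call a no-op when the
    # directory already exists, so no separate isdir() check is needed.
    os.makedirs(directory_path, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    # Mimic an archive that has neither img/ nor img/avatars/.
    nested = os.path.join(root, "img", "avatars")

    try:
        os.mkdir(nested)  # fails: the parent directory img/ does not exist
        mkdir_failed = False
    except FileNotFoundError:
        mkdir_failed = True

    make_directory_if_needed(nested)  # creates img/ and img/avatars/
    make_directory_if_needed(nested)  # second call is harmless
    created = os.path.isdir(nested)
```

If the script has to keep running on Python 2, the `isdir` check plus `os.makedirs` from the patch is the equivalent form, since `exist_ok` is not available there.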

@mwichary
Owner

@kfogel Thanks for letting me know. I’ll try to check out the new format.

@Zefling
Author

Zefling commented Aug 18, 2019

Twitter now offers two archives: one formatted as HTML without any media, and the other JSON-only with the media (pictures and videos). The link I gave is for the second format, which makes it possible to rebuild a formatted view of the data.
Example with data of my account: http://twitter.ikilote.net/tweets

@kfogel

kfogel commented Aug 18, 2019

Interesting observation from @Zefling. FWIW, Twitter only offered me the second kind of archive -- there are no .html files in it, but it does have media files. There was no point at which I chose this: it was just the default download offered to me, and there was no option for any other kind.

@mwichary
Owner

mwichary commented Aug 18, 2019 via email

@keithrbennett

Hi, I'm wondering if there may be any plans to update this script to run with the current format? Or if anyone knows of other tools for this?

Some things I noticed about the new (November 2022) format:

  • the main HTML file is now named 'Your archive.html'.
  • the actual tweets are in a file named data/tweets.js. This file is basically a JSON file with a JavaScript assignment wrapping it.
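
Since data/tweets.js is described above as JSON wrapped in a JavaScript assignment, one way to read it is to drop everything up to the first `=` and parse the remainder as JSON. This is a minimal sketch, not a tested reader for real archives: the helper name `load_twitter_js`, the variable name `window.YTD.tweets.part0`, and the record layout in the sample string are assumptions for illustration only.

```python
import json


def load_twitter_js(text):
    """Parse the contents of an archive .js file (hypothetical helper).

    Assumes the file has the shape `<some variable> = <JSON value>`.
    """
    _, sep, payload = text.partition("=")
    if not sep:
        raise ValueError("no assignment found; is this an archive .js file?")
    # Tolerate surrounding whitespace and an optional trailing semicolon.
    return json.loads(payload.strip().rstrip(";"))


# Assumed sample; a real file would be read with
# open("data/tweets.js", encoding="utf-8").read()
sample = 'window.YTD.tweets.part0 = [ {"tweet": {"id_str": "1", "full_text": "hello"}} ]'
tweets = load_twitter_js(sample)
first_text = tweets[0]["tweet"]["full_text"]
```

Splitting on the first `=` only is deliberate: any `=` characters inside tweet text stay in the JSON payload untouched.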

@kfogel

kfogel commented Nov 26, 2022

...Or if anyone knows of other tools for this?

@keithrbennett You may want to also look at https://github.com/timhutton/twitter-archive-parser/, and at the long list of other tools given at the end of its README.md.
