{{Short description|Web archive}} {{Lowercase title}} {{Distinguish|Internet Archive}} {{Self-reference|For guidance on the resolution of links to archive.today or its mirrors, see Wikipedia:archive.today guidance}} {{Bots|deny=FrescoBot,AWB}} {{Use dmy dates|date=January 2026}} {{Infobox website | name = archive.today | logo = archive.today logo with subtitle.svg{{!}}class=skin-invert | screenshot = File:Archive.today Screenshot - 12.19.2024.png | screenshot_size = 300px | caption = Screenshot of the archive.today home page | url = {{Plainlist| * archive.today * archive.fo * archive.is * archive.li * archive.md * archive.ph * archive.vn * {{Onion URL|archiveiya74codqgiixo33q62qlrqtkgmcitqx5u2oeqnmn5bpcbiyd}}<ref group=note>{{Cite tweet|user=archiveis|number=1189322374598053890|title=a current list of all tor domains and clear net domains|date=30 October 2019}}</ref> }} | type = Web archiving | language = Multilingual | registration = No | launch_date = {{Start date and age|2012|5|16|df=y}}<ref group=note>{{Cite web|url=https://blog.archive.today/post/77015559437/when-did-the-archive-is-site-originally-launch|title=When did the Archive-is site originally launch?|website=Archive.today Blog|via=Tumblr|date=18 February 2014|access-date=10 April 2021|archive-date=2025-12-30|archive-url=https://web.archive.org/web/20251230195406/https://blog.archive.today/post/77015559437/when-did-the-archive-is-site-originally-launch|url-status=live}}</ref> }}
'''archive.today''' (also known as '''archive.is''', among other domains) is a web archiving website that saves snapshots on demand. It has support for JavaScript-heavy sites such as Google Maps and X.<ref>{{cite web|last1=Brinkmann|first1=Martin|date=22 April 2015|title=Create publicly available web page archives with Archive.is|url=https://www.ghacks.net/2015/04/22/create-publicly-available-web-page-archives-with-archive-is/|url-status=live|archive-url=https://web.archive.org/web/20190412072055/https://www.ghacks.net/2015/04/22/create-publicly-available-web-page-archives-with-archive-is/|archive-date=12 April 2019|access-date=13 June 2015|website=Ghacks}}</ref> archive.today records two snapshots: one replicates the original webpage including any functional live links; the other is a screenshot of the page.<ref>{{cite journal|last1=Brunelle|first1=Justin F.|last2=Kelly|first2=Mat|last3=Weigle|first3=Michele C.|last4=Nelson|first4=Michael L.|date=25 January 2015|title=The impact of JavaScript on archivability|url=https://www.cs.odu.edu/~mweigle/papers/brunelle-ijdl16.pdf|url-status=live|journal=International Journal on Digital Libraries|volume=17|issue=2|pages=95–117|doi=10.1007/s00799-015-0140-8|s2cid=8433375|archive-url=https://web.archive.org/web/20190527064810/https://www.cs.odu.edu/~mweigle/papers/brunelle-ijdl16.pdf|archive-date=27 May 2019}}</ref>
The website has come under scrutiny from many governments since the late 2010s, including bans in China and Russia. In 2025, the U.S. Federal Bureau of Investigation (FBI) subpoenaed a domain registrar to identify the owner of archive.today's domain name.
== History == archive.today was founded in 2012 as a web archive. It allegedly registered its trademark in the Czech Republic in 2013.<ref name="WM">{{cite news |last=McCurdy |first=Will |title=Wikipedia Blacklists Archive.today Links Over Alleged DDoS Attack on Blogger |work=PC Magazine |date=21 February 2026 |accessdate=2 March 2026 |url=https://www.pcmag.com/news/wikipedia-blacklists-archiveis-after-alleged-ddos-attack-on-blogger |archive-url=https://web.archive.org/web/20260221155023/https://www.pcmag.com/news/wikipedia-blacklists-archiveis-after-alleged-ddos-attack-on-blogger |archive-date=21 February 2026 |url-status=live}}</ref> The site originally branded itself as archive.today, but changed the primary mirror to archive.is in May 2015.<ref group=note>{{cite web|url=https://blog.archive.is/post/118010496181/why-did-you-change-the-url-back-from-archive-today|title=Why did you change the URL back from archive-today to archive-is?|work=Archive.is Blog|date=3 May 2015|archive-url=https://web.archive.org/web/20150601001607/https://blog.archive.is/post/118010496181/why-did-you-change-the-url-back-from-archive-today|archive-date=1 June 2015|url-status=live|access-date=6 January 2019}}</ref> It began to deprecate the archive.is domain in favor of other mirrors in January 2019.<ref group=note>{{cite tweet |user=archiveis |number=1081276424781287427 |title=Please do not use archive.IS mirror for linking, use others mirrors [.TODAY .FO .LI .VN .MD .PH]. .IS might stop working soon.|date=4 January 2019|archive-url=https://web.archive.org/web/20190106000101/https://twitter.com/archiveis/status/1081276424781287427|archive-date=6 January 2019|url-status=live}}</ref> According to the archive.today blog, the website had saved about 500 million pages by 2021,<ref group=note>{{Cite news |date=2021-09-03 |title=What percentage of 5-char-codes is used now? [...] 
|url=https://archive-is.tumblr.com/post/661314951417315328/what-percentage-of-5-char-codes-is-used-now-full |archive-url=https://web.archive.org/web/20260129230947/https://archive-is.tumblr.com/post/661314951417315328/what-percentage-of-5-char-codes-is-used-now-full |archive-date=29 January 2026 |access-date=2026-02-11 |work=Archive.is blog |publisher=Tumblr |url-status=live }}</ref> 700 terabytes in total size.<ref name="AC">{{cite news |last=Cuesta |first=Albert |title=L'FBI, a la caça del web arxivat que incomoda els mitjans |work=Ara |date=15 November 2025 |accessdate=2 March 2026 |url=https://www.ara.cat/media/l-fbi-caca-web-arxivat-incomoda-mitjans_1_5561784.html |archive-url=https://web.archive.org/web/20251117060008/https://www.ara.cat/media/l-fbi-caca-web-arxivat-incomoda-mitjans_1_5561784.html |archive-date=17 November 2025 |url-status=live |lang=ca}}</ref>
In July 2013, archive.today began supporting the API of the Memento Project at Los Alamos National Laboratory.<ref>{{cite web|last1=Nelson|first1=Michael L.|url=https://ws-dl.blogspot.nl/2013/07/2013-07-09-archiveis-supports-memento.html|title=Archive.is Supports Memento|publisher=Web Science and Digital Libraries Research Group at Old Dominion University|work=Research and Teaching Updates|date=9 July 2013|archive-url=https://web.archive.org/web/20130727194715/https://ws-dl.blogspot.de/2013/07/2013-07-09-archiveis-supports-memento.html|archive-date=27 July 2013|url-status=live|access-date=17 September 2013|language=en}}</ref><ref>{{cite web |title=archive.is |url=https://mementoweb.org/depot/native/archiveis/ |archive-url=https://web.archive.org/web/20130915191950/https://mementoweb.org/depot/native/archiveis/ |archive-date=15 September 2013 |access-date=17 September 2013 |website=Memento Protocol Information |publisher=Memento Development Group}}</ref> Due to budget constraints at LANL, the Memento Project was disestablished in September 2025.<ref>{{cite mailing list |url=https://groups.google.com/g/memento-dev/c/Tr_xBWzBr0Y |title=Memento TimeTravel sunset |date=2025-08-07 |access-date=2025-11-21 |mailing-list=memento-dev |last=Taylor |first=Nicholas |archive-url=https://web.archive.org/web/20250818150526/https://groups.google.com/g/memento-dev/c/Tr_xBWzBr0Y |archive-date=2025-08-18 |url-status=live}}</ref> Archive.today was one of the last major active users of the Memento protocol following the project's downsizing.<ref name="Memento-sunset">{{cite web |last1=Taylor |first1=Nicholas |date=7 August 2025 |title=Memento TimeTravel sunset |url=https://groups.google.com/g/memento-dev/c/Tr_xBWzBr0Y |publisher=Memento Development Group |access-date=27 April 2026 |archive-date=6 December 2025 |archive-url=https://web.archive.org/web/20251206013048/https://groups.google.com/g/memento-dev/c/Tr_xBWzBr0Y |url-status=live }}</ref> The closure of the Memento infrastructure 
at LANL in September 2025 came amid a broader period of increased scrutiny for the service.
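The Memento protocol (RFC 7089) lets a client ask a "TimeGate" for the capture closest to a given moment by sending an <code>Accept-Datetime</code> header. The following is a minimal sketch of how such a request could be assembled; the TimeGate prefix <code>TIMEGATE_PREFIX</code> and the helper name <code>build_timegate_request</code> are illustrative assumptions rather than documented archive.today endpoints, and no network call is made.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumed TimeGate prefix, for illustration only; not confirmed by the sources cited here.
TIMEGATE_PREFIX = "https://archive.today/timegate/"


def build_timegate_request(original_url, when):
    """Return (timegate_url, headers) for a Memento datetime-negotiated lookup."""
    # RFC 7089 expresses Accept-Datetime as an HTTP-date in GMT.
    when_utc = when.astimezone(timezone.utc)
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return TIMEGATE_PREFIX + original_url, headers


timegate_url, headers = build_timegate_request(
    "https://example.com/",
    datetime(2013, 7, 9, 12, 0, 0, tzinfo=timezone.utc),
)
print(timegate_url)
print(headers["Accept-Datetime"])  # Tue, 09 Jul 2013 12:00:00 GMT
```

A Memento-compliant server would typically answer such a request with a redirect to the closest capture and a <code>Memento-Datetime</code> response header.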
The Russian independent media outlet ''Mediazona'' uses the site to preserve social media profiles and posts of Russian servicemen killed in the Russo-Ukrainian war, as part of its Russia 200 project, a named database of confirmed Russian military casualties compiled jointly with the BBC Russian Service and a team of volunteers.<ref>{{cite web |title=Russian losses in the war with Ukraine. Mediazona count, updated |url=https://en.zona.media/article/2022/05/11/casualties_eng |website=Mediazona |access-date=27 April 2026 |archive-date=11 June 2024 |archive-url=https://web.archive.org/web/20240611105305/https://en.zona.media/article/2022/05/11/casualties_eng |url-status=live }}</ref> Individual profile pages on 200.zona.media link to snapshots of social media posts by relatives, obituaries in local media, and other open-source evidence used to verify each death.
In early 2023, a team of researchers at the University of Amsterdam identified archive.today as the most-used open-access archiving service among fact-checking organisations, based on the {{ill|European Digital Media Observatory|it|Osservatorio europeo dei media digitali}}'s dataset on the Russo-Ukrainian war.<ref>{{cite news |title=Losing our memory of fake news |work=Community Research and Development Information Service |date=24 February 2023 |accessdate=2 March 2026 |url=https://cordis.europa.eu/article/id/442985-losing-our-memory-of-fake-news |archive-url=https://web.archive.org/web/20251209075049/https://cordis.europa.eu/article/id/442985-losing-our-memory-of-fake-news |archive-date=9 December 2025 |url-status=live}}</ref><ref>{{cite web |last=Porcellini |first=Valentin |title=Mapping the "memory loss" of disinformation in fact-checks: the challenge of preserving disinformation traces |website=vera.ai |accessdate=2 March 2026 |url=https://www.veraai.eu/posts/mapping-memory-loss-archiving-dmi-winter-school |archive-url=https://web.archive.org/web/20230126093904/https://www.veraai.eu/posts/mapping-memory-loss-archiving-dmi-winter-school |archive-date=26 January 2023 |url-status=live}}</ref>
In August 2023, Jani Patokallio published an investigation on his blog ''Gyrovague'' into archive.today's funding sources and the founder's identity.<ref name="jani1" /> The founder has stated a belief that the Patokallio family's increased interest is linked to how archive.today is being used in the context of the Russo-Ukrainian war.<ref group=note>{{cite web |title=Ladies and gentlemen, [...] |work=Blog of http:/<span></span>/archive.today/ |date=January 26, 2026 |url=https://lj.rossia.org/users/archive_today/615.html}}</ref>
On 30 October 2025, the US Federal Bureau of Investigation (FBI) subpoenaed archive.today's domain registrar, Tucows. The subpoena stated its purpose was to identify the owner(s) of the archive.today domain name, and that it was part of a criminal investigation conducted by the FBI, the nature of which was not disclosed.<ref name="Koebler">{{cite web |last1=Koebler |first1=Jason |title=FBI Tries to Unmask Owner of Infamous Archive.is Site |url=https://www.404media.co/fbi-tries-to-unmask-owner-of-infamous-archive-is-site/ |website=404 Media |access-date=6 November 2025 |archive-url=https://web.archive.org/web/20251106150129/https://www.404media.co/fbi-tries-to-unmask-owner-of-infamous-archive-is-site/ |archive-date=6 November 2025 |url-status=dead}}</ref><ref>{{cite news | last=Kirchner | first=Malte | title=Archive.today: FBI Demands Data from Provider Tucows | date=5 November 2025 | url=https://www.heise.de/en/news/Archive-today-FBI-Demands-Data-from-Provider-Tucows-11066346.html | work=heise.de }}</ref> The Catalan daily ''Ara'' interpreted the action as part of a campaign to selectively criminalize anonymous digital archives reliant on micro-donations (such as Anna's Archive, eliminated by Google from its search results), even though industrial datasets used for training large language models (such as the Common Crawl, financed by OpenAI and Anthropic) also fail to compensate content creators and owners.<ref name="AC" /> News coverage of the subpoena mentioned Patokallio's report, where Patokallio has said there were "several indications" the founder was based in Russia.<ref name="jani1" /> An independent investigation conducted by Swiss attorney Martin Steiger reached similar conclusions.<ref>{{cite web |last1=Steiger |first1=Martin |author-link=Martin Steiger |date=5 May 2024 |title=«archive.today» liefert Daten von Website-Besuchern nach Russland |url=https://steigerlegal.ch/2024/05/05/archive-today-russland/ |work=Steiger Legal |language=de |access-date=27 
April 2026 |archive-date=29 April 2026 |archive-url=https://web.archive.org/web/20260429083825/https://steigerlegal.ch/2024/05/05/archive-today-russland/ |url-status=live }}</ref>
In November 2025, the DNS provider AdGuard DNS reported that it had been pressured by a French organization calling itself ''Web Abuse Association Defense'' (WAAD) to block archive.today and its mirror domains. WAAD alleged that archive.today had refused to remove child sexual abuse material since 2023, invoking French LCEN law to demand action. AdGuard DNS contacted archive.today directly and reported that the flagged content was promptly removed upon notification, and that archive.today stated it had never received prior complaints about those URLs. AdGuard's investigation found that WAAD was a recently registered association with minimal public presence, and described the complaints as suspicious, noting evidence of possible impersonation of a real French lawyer in prior similar complaints sent to other companies. AdGuard announced it would file a criminal complaint with French police.<ref>{{cite web |url=https://adguard-dns.io/en/blog/archive-today-adguard-dns-block-demand.html |title=Behind the complaints: Our investigation into the suspicious pressure on Archive.today |website=AdGuard DNS Blog |date=November 13, 2025 |access-date=April 27, 2026 |author=Meshkov, Andrey}}</ref><ref>{{cite web |url=https://gigazine.net/gsc_news/en/20251117-suspicious-pressure-on-archive-today/ |title=AdGuard DNS publishes investigation results revealing that the organization pressuring Archive.today is highly suspicious |website=GIGAZINE |date=November 17, 2025 |access-date=April 27, 2026}}</ref>
thumb|300px|Screenshot of archive.today performing a DDoS attack on gyrovague.com|alt=Screenshot of LibreWolf's developer tools on the "Network" tab, with multiple automated connections to "gyrovague.com" made by a JavaScript script (all of them are blocked by uBlock Origin browser extension)
On 8 January 2026, Patokallio's hosting provider Automattic notified him that it had received a GDPR complaint from a person identifying herself as "Nora". The complaint alleged that the 2023 ''Gyrovague'' investigation "contains extensive personal data… presented in a narrative that is defamatory in tone and context." After Patokallio submitted a rebuttal, Automattic sided with him and left the post up.<ref name="Ars-DDoS">{{cite news |last1=Brodkin |first1=Jon |date=10 February 2026 |title=Archive.today CAPTCHA page executes DDoS; Wikipedia considers banning site |url=https://arstechnica.com/tech-policy/2026/02/wikipedia-might-blacklist-archive-today-after-site-maintainer-ddosed-a-blog/ |work=Ars Technica |access-date=27 April 2026 |archive-date=10 February 2026 |archive-url=https://web.archive.org/web/20260210203739/https://arstechnica.com/tech-policy/2026/02/wikipedia-might-blacklist-archive-today-after-site-maintainer-ddosed-a-blog/ |url-status=live }}</ref> Subsequent investigation suggested that "Nora" was likely an appropriated identity—the name belonged to either a real person or a trademark of a clothing brand, whose only connection to archive.today had been a prior content takedown request.<ref name=JB/><ref name="Ars-DDoS"/> On 10 January 2026, the archive.today webmaster sent Patokallio an email asking him to temporarily remove the 2023 post; when Patokallio declined, the DDoS attack began several days later.<ref name="Ars-DDoS"/>
On 14 January 2026, it emerged that archive.today had modified its CAPTCHA page to discreetly send repeated requests to ''Gyrovague'', thereby causing visitors to unwittingly contribute to a DDoS attack against the blog. A Tumblr account seemingly associated with archive.today had recently posted several public criticisms of Patokallio. Emails released by Patokallio show archive.today requesting the temporary removal of his report and later threatening him with AI pornography.<ref name="jani1">{{Cite web |last=Brodkin |first=Jon |date=February 10, 2026 |title=Archive.today CAPTCHA page executes DDoS; Wikipedia considers banning site |url=https://arstechnica.com/tech-policy/2026/02/wikipedia-might-blacklist-archive-today-after-site-maintainer-ddosed-a-blog/ |access-date=2026-02-11 |website=Ars Technica |language=en |archive-date=10 February 2026 |archive-url=https://web.archive.org/web/20260210203739/https://arstechnica.com/tech-policy/2026/02/wikipedia-might-blacklist-archive-today-after-site-maintainer-ddosed-a-blog/ |url-status=live }}</ref> On 20 February 2026, the English Wikipedia banned links to archive.today, citing the DDoS attack and evidence that archived content was tampered with to insert Patokallio's name.<ref name="JB">{{Cite web |last=Brodkin |first=Jon |date=2026-02-20 |title=Wikipedia blacklists Archive.today, starts removing 695,000 archive links |url=https://arstechnica.com/tech-policy/2026/02/wikipedia-bans-archive-today-after-site-executed-ddos-and-altered-web-captures/ |access-date=2026-02-20 |website=Ars Technica |language=en |archive-date=20 February 2026 |archive-url=https://web.archive.org/web/20260220200935/https://arstechnica.com/tech-policy/2026/02/wikipedia-bans-archive-today-after-site-executed-ddos-and-altered-web-captures/ |url-status=live }}</ref> The decision was made despite concerns over maintaining content verifiability<ref name="JB" /> while removing and replacing the second-largest archiving service used across the Wikimedia 
Foundation's projects.<ref name="ML">{{cite news |last=Lewczuk |first=Maciej |title=Archive.today zamienił użytkowników w nieświadomych hakerów. Wikipedia reaguje na atak DDoS |work=PurePC |date=11 February 2026 |accessdate=2 March 2026 |url=https://www.purepc.pl/archive-today-wikipedia-atak-ddos-blog-hakerzy-fbi-linki-do-archiwow-tozsamosc-administratora |archive-url=https://web.archive.org/web/20260211143942/https://www.purepc.pl/archive-today-wikipedia-atak-ddos-blog-hakerzy-fbi-linki-do-archiwow-tozsamosc-administratora |archive-date=11 February 2026 |url-status=live |lang=pl}}</ref> The Wikimedia Foundation had stated its readiness to take action regardless of the community verdict.<ref name="JB" /><ref name="ML" /> Patokallio expressed his satisfaction with the outcome.<ref name="WM" />
During the community discussion, editors discovered that archive.today's operator had tampered with archived snapshots of webpages. In captures of a blog post related to the "Nora" pseudonym, the operator had replaced instances of "Nora" with "Jani Patokallio", including in comment fields that previously read "Comment as: Nora [surname]".<ref name=JB/> The alterations were subsequently reverted. The discovery was cited as a key factor in the blacklisting decision, as it undermined the premise that archived snapshots were faithful reproductions of the original pages.<ref name=JB/>
This was not the first time Wikipedia had restricted links to archive.today. In 2013, the community blacklisted archive.is, citing concerns about botnets, linkspamming, and the opaque manner in which the site was operated. The decision was overturned in 2016 following a new request for comment, and archive.today was removed from the spam blacklist. At the time of the 2026 ban, the site was the second-largest archiving service used across all Wikimedia Foundation projects, with over 695,000 links spread across approximately 400,000 pages.<ref name=JB/>
== Features ==
{{Primary sources section|date=July 2022}}
=== Archiving === archive.today can capture individual pages in response to explicit user requests.<ref>{{cite web|last1=Dascalescu|first1=Dan|author-link=Dan Dascalescu|url=https://wiki.dandascalescu.com/reviews/online_services/web_page_archiving|title=Web page archiving |work=Dan Dascalescu's Wiki |date=18 February 2013|access-date=3 October 2013|archive-url=https://web.archive.org/web/20130922192354/https://wiki.dandascalescu.com/reviews/online_services/web_page_archiving|archive-date=22 September 2013|url-status=dead}}</ref><ref>{{cite web|last1=Koebler|first1=Jason|url=https://www.vice.com/en/article/dear-gamergate-please-stop-stealing-our-shit/|title=Dear GamerGate: Please Stop Stealing Our Shit|date=29 October 2014|work=Motherboard|archive-url=https://web.archive.org/web/20260201012858/https://www.vice.com/en/article/dear-gamergate-please-stop-stealing-our-shit/|archive-date=2026-02-01|url-status=live|quote=There is no way for a website to protect itself from having an Archive.today user mirror the site.|access-date=22 March 2017}}</ref><ref group=note name="FAQ">{{cite web|title=Archive.today FAQ|url=https://archive.today/faq|website=archive.today|access-date=15 February 2019|language=en}}</ref> Since its beginning, it has supported crawling pages with URLs containing the now-deprecated hash-bang fragment ({{mono|#!}}).<ref group=note>{{cite web|url=https://archive.is/|title=Home page of Archive.is in 2013|archive-url=https://web.archive.org/web/20130112221411/https://archive.is/|archive-date=12 January 2013|url-status=dead}}</ref> The website records only text and images, excluding XML, RTF, spreadsheet (xls or ods) and other non-static content. 
However, videos for certain sites, like Twitter, are saved.<ref group=note>{{cite web|title=Have you considered allowing small mp4s [...]|work=Archive.is blog|url=https://blog.archive.today/post/657607767402659840/have-you-considered-allowing-small-mp4s-and-webms|url-status=dead|archive-url=https://web.archive.org/web/20210907153549/https://blog.archive.today/post/657607767402659840/have-you-considered-allowing-small-mp4s-and-webms|archive-date=7 September 2021}}</ref> It keeps track of the history of snapshots saved, requesting confirmation before adding a new snapshot of an already saved page.<ref name="Occhipinti">{{Citation|last=Occhipinti|first=Kris|title=Archiving Websites with the Archive.is| date=15 April 2016 |via=YouTube |url=https://www.youtube.com/watch?v=LK_bp9_ZyQs|language=en|access-date=27 January 2022|archive-date=27 January 2022|archive-url=https://web.archive.org/web/20220127171624/https://www.youtube.com/watch?v=LK_bp9_ZyQs|url-status=live}}</ref><ref group=note>{{cite web|url=https://archive.today/https://support.google.com/webmasters/answer/6062608?hl=en|title=Example snapshot history on archive.is}}{{cbignore}}</ref> Once a web page is archived, it cannot be deleted directly by any Internet user.<ref group=note>{{cite web |date=24 January 2013 |title=Some Frequently Asked Question |url=https://blog.archive.is/post/41395737942/how-can-i-delete-an-archived-page |url-status=live |archive-url=https://web.archive.org/web/20130926093655/https://blog.archive.is/post/41395737942/how-can-i-delete-an-archived-page |archive-date=26 September 2013 |access-date=12 November 2018 |website=Archive.today Blog |via=Tumblr}}</ref> Users can download archived pages as a ZIP file, except pages archived {{as of|2019|11|29|since=y|post=,|lc=y}}<ref group=note>{{cite web |date=17 July 2020 |title=The "download zip" button has been giving a "Not found" error for quite some time. 
|url=https://blog.archive.today/post/623883809154383872/the-download-zip-button-has-been-giving-a-not |url-status=live |archive-url=https://web.archive.org/web/20201003125618/https://blog.archive.today/post/623883809154383872/the-download-zip-button-has-been-giving-a-not |archive-date=3 October 2020 |website=Archive.is blog}}</ref> when archive.today changed its browser engine from PhantomJS to Chromium (non-headless).<ref group=note>{{cite web |date=20 May 2020 |title=What scraper or headless browser are you using? it works so well. |url=https://blog.archive.today/post/618635148292964352/what-scraper-or-headless-browser-are-you-using-it |url-status=live |archive-url=https://web.archive.org/web/20200521161738/https://blog.archive.today/post/618635148292964352/what-scraper-or-headless-browser-are-you-using-it |archive-date=21 May 2020 |accessdate=14 February 2025 |website=Archive.is blog}}</ref> archive.today does not obey robots.txt because it acts "as a direct agent of the human user."<ref group=note name="FAQ" />
Pages are captured at a browser width of 1,024 pixels. CSS is converted to inline CSS, removing responsive web design and selectors such as <code>:hover</code> and <code>:active</code>. Content generated using JavaScript during the crawling process appears in a frozen state.<ref group=note>JavaScript-generated loading animation of Dailymotion video https://archive.today/20200121182128/https://www.dailymotion.com/video/x3sexy8 appearing in a frozen state</ref> HTML class names are preserved inside the <code>old-class</code> attribute. When text is selected, a JavaScript applet generates a URL fragment seen in the browser's address bar that automatically highlights that portion of the text when visited again.{{Citation needed|date=February 2026}} Web pages can be duplicated from archive.today to web.archive.org as second-level backup, but archive.today does not save its snapshots in WARC format. The reverse—from web.archive.org to archive.today—is also possible,<ref group="note">See, for example, [https://archive.today/20190324174341/https://web.archive.org/web/20130520191911/https://es.wikipedia.org/wiki/Wikipedia], an archive.today save of an Internet Archive snapshot of Spanish Wikipedia's home page.</ref><!--this is an example of the service, probably not worth removing under the deprecation--> but the copy usually takes more time than a direct capture.
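The snapshot links cited in this article follow a visible pattern: a mirror host, a 14-digit capture timestamp, and then the original URL. The sketch below splits a link of that long form into its parts. The names <code>Snapshot</code> and <code>parse_snapshot_url</code> are hypothetical, the pattern is inferred only from the example links above, and short-code snapshot links are not handled.

```python
import re
from datetime import datetime
from typing import NamedTuple


class Snapshot(NamedTuple):
    host: str            # mirror domain that served the snapshot
    captured: datetime   # capture time as encoded in the URL (time zone not stated)
    original_url: str    # address of the page that was archived


# Long-form pattern inferred from the example links: host / 14 digits / original URL.
_SNAPSHOT_RE = re.compile(r"^https?://([^/]+)/(\d{14})/(.+)$")


def parse_snapshot_url(url):
    """Split a long-form snapshot URL into host, capture time and original URL."""
    match = _SNAPSHOT_RE.match(url)
    if match is None:
        raise ValueError("not a long-form snapshot URL: " + url)
    host, stamp, original = match.groups()
    return Snapshot(host, datetime.strptime(stamp, "%Y%m%d%H%M%S"), original)


snap = parse_snapshot_url(
    "https://archive.today/20200121182128/https://www.dailymotion.com/video/x3sexy8"
)
print(snap.host, snap.captured.isoformat(), snap.original_url)
```

Because the original URL is captured greedily, nested captures (such as an archive.today save of a web.archive.org snapshot) keep the inner archive address intact in <code>original_url</code>.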
thumb|Archive of a Wikipedia webpage by archive.today on 5 January 2026
While a page is being saved, a list of URLs for individual page elements is shown, along with their content sizes, HTTP statuses and MIME types. This list can only be viewed during the crawling process.{{fact|date=January 2025}} Users can ask the owner, via the site's blog, to remove advertisements and popups from archived pages or to expand links in them.<ref group=note>{{Cite web |title=Example user request on the Archive.is blog |url=https://blog.archive.today/post/677427547064156160/could-you-expand-17wus-and-links-under-same |url-status=live |archive-url=https://web.archive.org/web/20220429215629/https://blog.archive.today/post/677427547064156160/could-you-expand-17wus-and-links-under-same |archive-date=29 April 2022 |access-date=7 April 2022 |website=Archive.is blog}}</ref>
According to the site's FAQ, archive.today's storage layer runs on Apache Hadoop and Apache Accumulo, with all data stored on the Hadoop Distributed File System (HDFS). Textual content is replicated three times across servers in two data centers, both located in Europe, with at least one hosted by the French provider OVH; images are replicated twice.<ref group=note name="FAQ"/> The site does not store snapshots in WARC format.<ref group=note name="FAQ"/>
The scraping component has used a modified version of the Chromium browser since November 2019, replacing the previous PhantomJS-based engine.<ref group=note name="FAQ"/>
=== Search ===
The search toolbar supports advanced keyword operators, using {{code|*}} as the wildcard character. Paired quotation marks restrict the search to an exact sequence of keywords present in the title or body of the webpage, whereas the ''insite'' operator restricts it to a specific Internet domain.<ref group=note>For example, the string insite: <nowiki>https://en.wikipedia.org</nowiki> "World Cup" returns the https://archive.today/search/?q=insite%3A+http%3Aen.wikipedia.org+ "World+Cup"/ related snapshots</ref> When a dynamic list is saved, the archive.today search box shows only a result that links to the previous and following sections of the list (e.g. 20 links per page).<ref group=note>Example of dynamic list: {{cite web |title=au:"thomas aquinas" |url=https://www.worldcat.org/search?q=au%3A%22thomas+aquinas%22&fq=&dblist=638&start=21&qt=page_number_link |url-status=live |archive-url=https://web.archive.org/web/20190323131756/https://www.worldcat.org/search?q=au%3A%22thomas+aquinas%22&fq=&dblist=638&start=21&qt=page_number_link |archive-date=23 March 2019 |access-date=15 December 2018 |website=WorldCat}}</ref> The other web pages saved are filtered, and sometimes may be found by one of their occurrences.<ref name="Occhipinti" />{{clarify|date=July 2022}} <!--seems to be talking about pagination?--> The search feature is backed by Google CustomSearch. If it delivers no results, archive.today attempts to use Yandex Search.<ref group=note>{{Cite web |date=18 January 2022 |title=Just realized that I can search for keywords in the search bar for archive today, was this a recently added feature? 
|url=https://blog.archive.today/post/673695282217762816/just-realized-that-i-can-search-for-keywords-in |url-status=live |archive-url=https://web.archive.org/web/20220127183557/https://blog.archive.today/post/673695282217762816/just-realized-that-i-can-search-for-keywords-in |archive-date=27 January 2022 |access-date=27 January 2022 |website=Archive.is}}</ref>
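The operators described above can be combined into a single query string and percent-encoded into a search link. The following is a rough sketch, assuming the <code>/search/?q=</code> path that appears in the footnoted example; the helper name <code>build_search_url</code> is hypothetical, and the exact spacing the site accepts after <code>insite:</code> is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode

# Search path taken from the footnoted example link; treat it as an assumption.
SEARCH_ENDPOINT = "https://archive.today/search/"


def build_search_url(phrase, domain: Optional[str] = None):
    """Build a search link for an exact phrase, optionally limited to one domain."""
    # Paired quotation marks request an exact sequence of keywords.
    query = '"' + phrase + '"'
    if domain is not None:
        # The insite operator restricts results to a specific Internet domain.
        query = "insite:" + domain + " " + query
    return SEARCH_ENDPOINT + "?" + urlencode({"q": query})


search_url = build_search_url("World Cup", domain="en.wikipedia.org")
print(search_url)
# https://archive.today/search/?q=insite%3Aen.wikipedia.org+%22World+Cup%22
```

Using <code>urlencode</code> keeps the quotation marks and colon safely escaped, which avoids the malformed link shape seen when the query is pasted into a URL by hand.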
=== Bypassing paywalls ===
archive.today is frequently used to bypass paywalls on news websites, similarly to the defunct service 12ft.<ref>{{cite news |title=12ft Ladder Stopped Working? Here Are the 5 Best Alternatives |url=https://techpp.com/2025/09/01/best-12ft-ladder-alternatives/ |access-date=27 April 2026 |archive-date=30 April 2026 |archive-url=https://web.archive.org/web/20260430124444/https://techpp.com/2025/09/01/best-12ft-ladder-alternatives/ |url-status=live }}</ref>
==== Legal and ethical debate ====
The practice of sharing archive.today links to circumvent paywalls has sparked legal and ethical debate in Europe. In the Netherlands, journalist Peter Aanzee publicly challenged a physician who shared an archive.ph link to one of his paywalled articles in ''De Volkskrant'', arguing that distributing archived copies constituted copyright infringement.<ref name="Netkwesties">{{cite web |last1=Aanzee |first1=Peter |date=7 September 2025 |title=Is het gebruik van Archive.today illegaal? |url=https://www.netkwesties.nl/2155/is-gebruik-archive-today-illegaal.htm |work=Netkwesties |language=nl |access-date=27 April 2026}}</ref> The discussion drew on European Court of Justice jurisprudence on hyperlinking, particularly the 2016 ''GS Media v Sanoma'' ruling, which established that linking to illegally published content can constitute a copyright violation if the linker knew or ought to have known of the illegality—a presumption that applies automatically to parties acting for profit.<ref name="Netkwesties"/>
The largest Dutch publisher, DPG Media, acknowledged that archive.today is "a thorn in the side of many publishers (and journalists)" but noted that enforcement is difficult because the site is operated anonymously, hosted across multiple servers that frequently change location, and resurfaces under new domains when one is taken down.<ref name="Netkwesties"/>
==== Comparison with Internet Archive ====
Commentators have contrasted archive.today with the Internet Archive's Wayback Machine. The same paywalled ''Volkskrant'' article shared via archive.ph was also found archived in the Wayback Machine, demonstrating that both services can be used to circumvent paywalls.<ref name="Netkwesties"/> Like the Wayback Machine, archive.today does not advertise paywall circumvention among its stated features; the ability to bypass paywalls is a byproduct of its core function of archiving web pages as they appear to visitors.<ref name="Netkwesties"/>
==== AI agents ====
A 2025 investigation by journalist Henk van Ess found that AI chatbots, including ChatGPT, Perplexity AI, Grok, and Claude, exploit web archives to bypass paywalls during live web searches.<ref name="Netkwesties-ai">{{cite web |last1=Van Ess |first1=Henk |last2=Aanzee |first2=Peter |date=20 July 2025 |title=Hoe AI-bots stilletjes aan zoekers gratis artikelen kunnen bieden |url=https://www.netkwesties.nl/2124/hoe-bots-stilletjes-aan-zoekers-gratis.htm |work=Netkwesties |language=nl |access-date=27 April 2026 |archive-date=30 April 2026 |archive-url=https://web.archive.org/web/20260430055959/https://www.netkwesties.nl/2124/hoe-bots-stilletjes-aan-zoekers-gratis.htm |url-status=live }}</ref> In one documented case, ChatGPT retrieved a full article from ''The Economist'' via archive.today and then generated a five-point economic analysis in the publication's characteristic style and terminology.<ref name="Netkwesties-ai"/> Van Ess identified six distinct methods of paywall circumvention by AI systems, of which "archive exploitation" (finding archived copies on services such as archive.today and the Internet Archive) was the most direct.<ref name="Netkwesties-ai"/> Unlike documented concerns about AI models being trained on unpaywalled content, this behaviour involves real-time retrieval of archived copies during individual queries, effectively extending paywall circumvention from human users to automated agents.<ref name="Netkwesties-ai"/>
== Academic research ==
A 2018 study by researchers at University College London, University of Alabama at Birmingham, and Cyprus University of Technology, published at the AAAI International Conference on Web and Social Media (ICWSM), analysed 21 million URLs from archive.is's live feed and 356,000 archive.is URLs shared on Reddit, Twitter, Gab, and 4chan's /pol/ board over 14 months.<ref name="Zannettou">{{cite journal |last1=Zannettou |first1=Savvas |last2=Blackburn |first2=Jeremy |last3=De Cristofaro |first3=Emiliano |last4=Sirivianos |first4=Michael |last5=Stringhini |first5=Gianluca |date=2018 |title=Understanding Web Archiving Services and Their (Mis)Use on Social Media |journal=Proceedings of the International AAAI Conference on Web and Social Media |volume=12 |issue=1 |doi=10.1609/icwsm.v12i1.15018 |url=https://cdn.aaai.org/ojs/15018/15018-28-18537-1-2-20201228.pdf |access-date=27 April 2026}}</ref> The study found that news articles and social media posts were the most commonly archived content types, likely due to their "perceived ephemeral and/or controversial nature."<ref name="Zannettou"/>
The researchers documented that archive.is URLs were extensively shared in "fringe" communities such as 4chan's /pol/ board and the Reddit subreddit r/The_Donald, both to preserve potentially contentious content and to deny ad revenue to news outlets perceived as ideologically opposed.<ref name="Zannettou"/> Moderation bots on r/The_Donald automatically blocked direct links to certain news sites (for example, 46% of links to the ''New York Daily News'' were censored) and prompted users to post archive.is URLs instead.<ref name="Zannettou"/> The authors estimated that, on Reddit alone, the practice of sharing archived copies rather than direct links cost ''The Washington Post'' approximately US$70,000 per year in ad revenue.<ref name="Zannettou"/>
On Reddit, bots were responsible for posting 44% of archive.is links and 85% of Wayback Machine links across the studied subreddits, driven by moderators aiming to mitigate link rot.<ref name="Zannettou"/>
A 2023 study by researchers at the University of Amsterdam, as part of the vera.ai project, examined 1,991 fact-checking articles from the European Digital Media Observatory's "War in Ukraine" dataset. Of 41,758 extracted links, 6,002 were archived pages. archive.today was the most-used link archiving service, at 44.1%, ahead of the Internet Archive/Wayback Machine (29.2%) and Perma.cc (26.6%).<ref>{{Cite web |title=Mapping the 'memory loss' of disinformation in fact-checks |url=https://www.veraai.eu/ |access-date=2026-05-13 |website=www.veraai.eu |archive-date=24 November 2022 |archive-url=https://web.archive.org/web/20221124162524/https://www.veraai.eu/ |url-status=live }}</ref> Fact-checkers primarily used these services to preserve ephemeral and platform-restricted content, such as Facebook posts that are difficult to capture due to anti-bot measures.<ref>{{Cite web |title=Losing our memory of fake news |url=https://cordis.europa.eu/article/id/442985-losing-our-memory-of-fake-news |access-date=2026-05-13 |website=CORDIS {{!}} European Commission |language=en |archive-date=9 December 2025 |archive-url=https://web.archive.org/web/20251209075049/https://cordis.europa.eu/article/id/442985-losing-our-memory-of-fake-news |url-status=live }}</ref>
== Worldwide availability ==
=== Australia and New Zealand ===
{{see also|Internet censorship in Australia|Internet censorship in New Zealand}}
In March 2019, the site was blocked for six months by several internet providers in Australia and New Zealand in the aftermath of the Christchurch mosque shootings in an attempt to limit distribution of the footage of the attack.<ref>{{cite web|last=Chen|first=Caleb|title=ISPs in AU and NZ start censoring the internet without legal precedent|url=https://www.privateinternetaccess.com/blog/2019/03/isps-in-au-and-nz-start-censoring-the-internet-without-legal-precedent/|website=Private Internet Access|access-date=20 March 2019|date=19 March 2019|archive-date=28 April 2023|archive-url=https://web.archive.org/web/20230428152352/https://www.privateinternetaccess.com/blog/isps-in-au-and-nz-start-censoring-the-internet-without-legal-precedent/|url-status=live}}</ref><ref>{{cite web|last=Menegus|first=Bryan |date=19 March 2019 |title=New Zealand ISPs Say They're Blocking Sites That Fail To Remove Christchurch Shooting Video |url=https://www.gizmodo.com.au/2019/03/new-zealand-isps-say-theyre-blocking-sites-that-fail-to-remove-christchurch-shooting-video/ |url-status=dead |archive-url=https://web.archive.org/web/20190518223849/https://www.gizmodo.com.au/2019/03/new-zealand-isps-say-theyre-blocking-sites-that-fail-to-remove-christchurch-shooting-video/ |archive-date=18 May 2019 |access-date=20 March 2019 |work=Gizmodo Australia}}</ref>
=== China ===
{{See also|Internet censorship in China}}
According to GreatFire.org, archive.today has been blocked in mainland China {{as of|2016|3|post=,|since=y|lc=y}}<ref>{{cite web|url=https://en.greatfire.org/archive.is|title=archive.is is 100% blocked in China|date=12 August 2018|website=GreatFire Analyzer|archive-url=https://web.archive.org/web/20251112225252/https://en.greatfire.org/archive.is|archive-date=2025-11-12|url-status=live}}</ref> archive.li {{as of|2017|9|post=,|since=y|lc=y}}<ref>{{cite web|url=https://en.greatfire.org/https/archive.li|title=archive.li is 100% blocked in China|date=12 August 2018|website=GreatFire Analyzer|archive-url=https://web.archive.org/web/20260106051454/https://en.greatfire.org/https/archive.li|archive-date=2026-01-06|url-status=live}}</ref> archive.fo {{as of|2018|7|since=y|lc=y|post=,}}<ref>{{cite web|url=https://en.greatfire.org/https/archive.fo|title=archive.fo is 100% blocked in China|date=12 August 2018|website=GreatFire Analyzer|archive-url=https://web.archive.org/web/20251001012728/https://en.greatfire.org/https/archive.fo|archive-date=2025-10-01|url-status=live}}</ref> as well as archive.ph {{as of|2019|12|post=.|since=y|lc=y}}<ref>{{Cite web|title=archive.ph is 100% blocked in China|url=https://en.greatfire.org/https/archive.ph|access-date=7 April 2022|website=en.greatfire.org|archive-date=29 April 2022|archive-url=https://web.archive.org/web/20220429215631/https://en.greatfire.org/https/archive.ph|url-status=live}}</ref>
=== Finland ===
{{see also|Internet censorship in Finland}}
On 21 July 2015, archive.today blocked access to the service from all Finnish IP addresses, stating on Twitter that this was done to avoid escalating a dispute it allegedly had with the Finnish government.<ref>{{cite web|url=https://www.iltalehti.fi/digi/a/2015072220070969|title=Suomalaisilta estettiin haktivistien suosimalla verkkosivulla käynti|last1=Lapintie|first1=Lassi|date=22 July 2015|work=Iltalehti|trans-title=Finns' access to website used by hacktivists blocked|archive-url=https://web.archive.org/web/20190527064017/https://www.iltalehti.fi/digi/a/2015072220070969|archive-date=27 May 2019|url-status=live|access-date=4 March 2016|language=fi}}</ref><ref name="Toler">{{Cite web |last=Toler |first=Aric |date=2018-02-22 |title=How to Archive Open Source Materials |url=https://www.bellingcat.com/resources/how-tos/2018/02/22/archive-open-source-materials/ |access-date=2026-02-17 |website=bellingcat |language=en-GB |archive-date=17 August 2025 |archive-url=https://web.archive.org/web/20250817101615/https://www.bellingcat.com/resources/how-tos/2018/02/22/archive-open-source-materials/ |url-status=live }}</ref>
Since the conflict with the Finnish blogger in early 2026, the website has displayed an unpassable CAPTCHA to visitors from Finland.<ref>{{cite web | title=Internet {{!}} Suomalaismies suututti suositun verkkopalvelun ja joutui hyökkäyksen kohteeksi | date=20 March 2026 | url=https://www.hs.fi/visio/art-2000011856745.html | access-date=7 May 2026 | archive-date=21 March 2026 | archive-url=https://web.archive.org/web/20260321125802/https://www.hs.fi/visio/art-2000011856745.html | url-status=live }}</ref>
=== Russia ===
{{See also|Internet censorship in Russia}}
In 2016, the Russian communications agency Roskomnadzor began blocking access to archive.is from Russia.<ref>{{cite web|url=https://tjournal.ru/21966-roskomnadzor-zablokiroval-servis-archive-is-hranyashchiy-kopii-veb-saytov|script-title=ru:Роскомнадзор заблокировал сервис archive.is, хранящий копии веб-сайтов|last1=Elistratov|first1=Vladimir|date=29 January 2016|access-date=30 January 2016|website=TJournal|title=Roskomnadzor zablokiroval servis archive.is, khranyashchiy kopii veb-saytov|archive-url=https://web.archive.org/web/20170830055553/https://tjournal.ru/21966-roskomnadzor-zablokiroval-servis-archive-is-hranyashchiy-kopii-veb-saytov|archive-date=30 August 2017|url-status=live|language=ru}}</ref><ref>{{cite web|url=https://www.techdirt.com/articles/20160203/08365233504/russia-blocks-another-archive-site-because-it-might-contain-old-pages-about-drugs.shtml|title=Russia Blocks Another Archive Site Because It Might Contain Old Pages About Drugs|last1=Cushing|first1=Tim|date=4 February 2016|work=Techdirt|archive-url=https://web.archive.org/web/20190323131754/https://www.techdirt.com/articles/20160203/08365233504/russia-blocks-another-archive-site-because-it-might-contain-old-pages-about-drugs.shtml|archive-date=23 March 2019|url-status=live|access-date=26 February 2016}}</ref><ref name="Toler" />
On 23 March 2026, archive.today and several mirror domains were blocked by Russian authorities.<ref>{{cite web|last=Whittaker|first=Zack|url=https://techcrunch.com/2026/03/23/russian-authorities-block-paywall-removal-site-archive-today/|title=Russian authorities block paywall removal site Archive.today|date=2026-03-23|website=TechCrunch|access-date=2026-03-23|archive-url=https://web.archive.org/web/20260323225309/https://techcrunch.com/2026/03/23/russian-authorities-block-paywall-removal-site-archive-today/|archive-date=2026-03-23|url-status=live}}</ref>
== See also ==
{{Portal|Internet}}
* {{anl|Digital preservation}}
* {{anl|Link rot}}
* List of web archiving initiatives
== Notes ==
{{Reflist|group=note}}
== References ==
{{Reflist}}
== External links ==
{{Commons category|Archive.today|archive.today|lcfirst=yes}}
<!-- Per WP:ELMINOFFICIAL, choose one official website only -->
* [https://wiki.archiveteam.org/index.php/Archive.today archive.today] at Archive Team wiki
* {{srlink|Wikipedia:archive.today guidance}}
{{Digital preservation}}
{{Tor onion services|state=collapsed}}
{{Authority control}}
Category:History of the Internet
Category:History of Wikipedia
Category:Internet properties established in 2012
Category:Tor onion services
Category:Web archiving initiatives