The Internet Archive is a 501(c)(3) non-profit known as one of the largest repositories of internet content. The organization is legally recognized as a library by the United States and operates out of San Francisco, California.
The organization was founded in May 1996 by Brewster Kahle. The first page was archived that month, and by October they had archived a significant portion of the existing internet.
In 2001 they added The Wayback Machine to their website.
Expansion into archiving more than just web content started in 1999. Today you can find multiple media types on their website, including texts, video, audio, software, and images. The Internet Archive also began offering media for people with disabilities that make reading texts difficult or impossible, including the Digital Accessible Information System (DAISY) format.
Additionally, BitTorrent was added as a download option in 2012.
While they make a wide variety of digital content available, they prefer to focus their operations on books because they are a library. They scan up to 3,700 books per day and serve millions of patrons on a regular basis.
In November 2013, the building that housed The Internet Archive’s physical operations caught fire, damaging the facilities and equipment as well as destroying some irreplaceable items that were going to be digitized and added to the archive. The organization sought out donations to make repairs, which took place from 2014 to 2016.
Internet Archive Canada was announced in 2016. It would essentially be a copy of The Internet Archive based in Canada. Rumors quickly spread that this was a reaction to Donald Trump’s upcoming presidency, and Brewster Kahle addressed these concerns and others in an FAQ post on The Internet Archive’s blog.
Before the announcement of Internet Archive Canada, partial copies of The Internet Archive were already located in Alexandria, Egypt, and Amsterdam, the Netherlands. The organization already had plans to create a copy in Canada, but statements made by Donald Trump during his campaign led them to speed up their work on it. The FAQ cited multiple specific statements as reasons for working on Internet Archive Canada.
The Internet Archive Arts Residency began in 2018 and was created to connect artists with the large amount of artwork that The Internet Archive hosts online. This residency lasts a year and results in a body of work that is then exhibited. This connects digitized history with artists and is intended to help create new artwork that future generations can appreciate.
A major lawsuit was filed against The Internet Archive in 2020 by four major publishers: Hachette Book Group, Penguin Random House, HarperCollins, and John Wiley & Sons. They claimed that The Internet Archive’s practice of Controlled Digital Lending (CDL) infringed their copyrights. The lawsuit continued until 2023, when the court ruled in favor of the publishers. The Internet Archive is now forbidden from digitizing and lending books that have digital copies available for sale.
Also in 2023, Universal Music Group, Sony Music, and Concord filed a lawsuit against The Internet Archive over its Great 78 Project, which aimed to digitize 500,000 individual songs released from 1880 to 1960. Most of these songs had never been published in other media formats, and the goal was to make copies that were as historically accurate as possible. The labels claimed that the project resulted in copyright infringement and requested over 600 million dollars in damages. The lawsuit was settled in September 2025.
The Internet Archive collaborated with Google in September 2024, replacing the Google Cache feature with The Internet Archive’s Wayback Machine so that users could learn more about webpages they visited.
In July 2025, The Internet Archive was made a Federal Depository Library by the United States Senate. This allows the organization to make public access records available on their website.
That same year in September they opened a new branch in Europe.
The organization also faced a series of cyberattacks in 2024. These attacks made its services inconsistently accessible, sometimes rendering them unavailable for hours at a time. In May, a hacker group called SN_BLACKMETA claimed responsibility for the attacks, although there were possible connections with Anonymous Sudan.
Attacks resumed in October, resulting in site defacement and a data breach. SN_BLACKMETA claimed responsibility for these attacks as well. The data breach affected 31 million user accounts, and users’ email addresses and passwords were stolen by the hackers. The Wayback Machine and The Internet Archive’s website were both restored in a matter of days, but later that month threat actors struck again and claimed that, while SN_BLACKMETA had performed the DDoS attacks that affected the website, they themselves were responsible for all the other attacks. The next day The Internet Archive was restored and users were free to use it again.