Colourful vintage lithograph illustration depicting six stacked, winding railway lines crowded with steam trains, in a style typical of late 19th-century American railroad advertising prints. Each track level shows a different train: freight cars in red, yellow, pink, and blue; a bright yellow-and-red passenger train crossing an iron truss bridge over a river with a sailboat; more boxcars in mixed colors passing a rail yard where workers signal with flags; and at the bottom, a station platform scene with a green-and-cream passenger train, a maroon baggage car, and a horse-drawn carriage surrounded by well-dressed 19th-century travelers. Steam locomotives with tall smokestacks and red wheels pull each train, with plumes of dark smoke rising against a pale blue sky and distant hills. — 'Zig-zag Passenger and Freight Train' by an unknown artist. Original from Library of Congress.| Rawpixel

Fixing Work Imports from Other OTW-based Archives

Brennan Kenneth Brown

July 2, 2026

Last modified: July 1, 2026

Hello, hello! This is going to be one of my technical tutorial posts, but it is going to be even more niche than usual. By that I mean this will only be useful to... fewer than half a dozen people. But these people are my friends! Such as Melo, who runs Superlove, and Agnes, who runs Sunset, two of the OTWA-based writing archive websites.

So, let's start at the beginning. I run fanfiction.lol, a writing archive and community designed for both transformative works and original works alike! My site uses the OTWArchive codebase, which is the codebase powering Archive of Our Own. You can read my original announcement post for more information.

As of right now, there are two ways to add your own work to the site:

New Work: Where you add the work by hand.
- It's important to note that you should definitely be writing and saving your work elsewhere.
- Light HTML is accepted.
Import Work: Where you link to an already-existing work from another archive.
- This is an important feature, since fanfiction.lol is designed to be an alternative or mirror for people's work on AO3.

Unfortunately, importing from AO3 doesn't work at all by default, and the other OTWA-based archives have run into the same problem. And so, this is what I needed to fix!

The First Problem

Before anything else, attempting to import a work from Archive of Our Own returns the error:

"We couldn't successfully import that work, sorry: URL is for a work on the Archive. Please bookmark it directly instead."

The upstream OTWA codebase was written by and for Archive of Our Own, so two things stand in the way of importing from AO3 itself:

PERMITTED_HOSTS: A list of hostnames considered "the Archive". Any import URL whose hostname is in this list is rejected before the download even starts.
No AO3-specific parser: This one baffled me. Even if the block is removed, the generic fallback parser (parse_story_from_unknown) grabs the entire <body> of the fetched page, which on an AO3 work page is the full AO3 HTML including navigation, header, footer, etc. and not just the story text.

Both problems need to be fixed.
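
To make the first obstacle concrete, here is a minimal sketch of what a hostname check of this kind looks like. It is an illustration, not the upstream code: the helper name `blocked_for_import?` and the shortened host list are assumptions made up for this post.

```ruby
require 'uri'

# Hypothetical, shortened stand-in for the PERMITTED_HOSTS list in config/config.yml.
PERMITTED_HOSTS = %w[archiveofourown.org www.archiveofourown.org ao3.org].freeze

# Returns true when the URL's hostname is one the archive treats as "itself".
# (Hypothetical helper; upstream performs its own check inside the import code.)
def blocked_for_import?(url, hosts = PERMITTED_HOSTS)
  host = URI.parse(url).host
  !host.nil? && hosts.include?(host.downcase)
rescue URI::InvalidURIError
  false
end

puts blocked_for_import?('https://archiveofourown.org/works/12345')
puts blocked_for_import?('https://example.com/story/1')
```

With the stock list, every AO3 hostname lands in the first case, which is why the import is rejected before anything is downloaded.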

Part 1: Remove AO3 from `PERMITTED_HOSTS`

File: `config/config.yml`

Search for the PERMITTED_HOSTS key (around line 750). You will find a list that includes AO3's production IP addresses and every AO3 domain variant:

PERMITTED_HOSTS: [
  # Production
  "104.153.64.122",
  "208.85.241.152",
  "208.85.241.157",
  "ao3.org",
  "archiveofourown.com",
  "archiveofourown.net",
  "archiveofourown.org",
  "download.archiveofourown.org",
  "insecure.archiveofourown.org",
  "secure.archiveofourown.org",
  "www.ao3.org",
  "www.archiveofourown.com",
  "www.archiveofourown.net",
  "www.archiveofourown.org",
  # fanfiction.lol  <-- this will be your fork's domain
  "fanfiction.lol",
  "www.fanfiction.lol",
  # Staging
  "insecure-test.archiveofourown.org",
  "test.archiveofourown.org",
  "testdownload.archiveofourown.org"
]

Replace the entire block so it contains only your own archive's hostnames:

PERMITTED_HOSTS: [
  # Your archive's own hostnames — imports from these are blocked (bookmark instead)
  "yourdomain.example",
  "www.yourdomain.example",
  "status.yourdomain.example"
]

This list does two jobs: a URL whose hostname is in the list is blocked from being imported (users should bookmark their own works, not import them), and a URL whose hostname is in the list is allowed in Abuse Reports.

So, keep your own domain(s) in the list. Remove everything AO3-related.

Bonus: Dead key in `config/local.yml`

You may find a permitted_hosts key (lowercase) in your config/local.yml. Due to how the OTWA config loader works (app_config.merge!(...) in config/application.rb), this key is a different key from PERMITTED_HOSTS (uppercase) and is silently ignored. The ArchiveConfig.PERMITTED_HOSTS method only reads the uppercase key from config.yml. You can safely remove the lowercase permitted_hosts block from local.yml to avoid confusion.
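
The reason the lowercase key has no effect comes down to hash keys being case-sensitive. Here is a small sketch of that behaviour with plain Ruby hashes; the two hashes are hypothetical miniatures of the config files, not the real loader.

```ruby
# Hypothetical miniatures of config/config.yml and config/local.yml as hashes.
config_yml = { 'PERMITTED_HOSTS' => ['yourdomain.example'] }
local_yml  = { 'permitted_hosts' => ['archiveofourown.org'] }

merged = config_yml.dup
merged.merge!(local_yml)

# The lowercase key is added next to the uppercase one instead of replacing it,
# so anything that reads PERMITTED_HOSTS never sees the local.yml value.
puts merged.keys.inspect
puts merged['PERMITTED_HOSTS'].inspect
```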

Part 2: Add an AO3-specific Story Parser

Even after removing AO3 from PERMITTED_HOSTS, the import will succeed but the chapter content will be the entire AO3 page HTML: navigation, header, login form, footer, and all. This is because the fallback parser doesn't know AO3's HTML structure.

This is the much trickier part. You need to add a custom parser that knows how to extract only the story content from an AO3 work page.

Thankfully, I made this for you already!

File: `app/models/story_parser.rb`

Make the following three changes:

Change 1: Add the `SOURCE_AO3` regex constant

Find the block of SOURCE_* constants (around line 57):

SOURCE_LJ = '((live|dead|insane)journal\.com)|journalfen(\.net|\.com)|dreamwidth\.org'.freeze
SOURCE_DW = 'dreamwidth\.org'.freeze
SOURCE_FFNET = '(^|[^A-Za-z0-9-])fanfiction\.net'.freeze
SOURCE_DEVIANTART = 'deviantart\.com'.freeze

Add a new line before SOURCE_LJ:

SOURCE_AO3 = '(archiveofourown\.org|ao3\.org)'.freeze
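
If you want to sanity-check the pattern before touching the app, a quick sketch like this works in plain `irb`. It assumes the string is compiled into a regex and tested against the full URL; the sample URLs are made up.

```ruby
SOURCE_AO3 = '(archiveofourown\.org|ao3\.org)'.freeze

# Assumption: the constant is compiled with Regexp.new and matched against the URL.
AO3_PATTERN = Regexp.new(SOURCE_AO3)

sample_urls = [
  'https://archiveofourown.org/works/12345',
  'https://ao3.org/works/12345',
  'https://www.fanfiction.net/s/1/1/'
]

sample_urls.each do |url|
  puts "#{url} => #{AO3_PATTERN.match?(url)}"
end
```

Note that the pattern is not anchored, so it matches anywhere in the string, including a query string that happens to contain `ao3.org`.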

Change 2: Add `ao3` to `KNOWN_STORY_PARSERS`

Find:

KNOWN_STORY_PARSERS = %w[deviantart dw lj].freeze

You'll see that the importer works for deviantART, Dreamwidth, and LiveJournal by default.

Add ao3 to it:

KNOWN_STORY_PARSERS = %w[ao3 deviantart dw lj].freeze

The list is checked in order and stops at the first match, so ao3 should come first (it is the most specific match for an AO3 URL).

Change 3: Add the `parse_story_from_ao3` method

This section required me to inspect the source code for a page of a work on AO3, and use a little regular expression.

Find the parse_story_from_deviantart method and add the following method after it (before shift_chapter_attributes):

def parse_story_from_ao3(_story, detect_tags = true)
  work_params = { chapter_attributes: {} }

  # Title: use the work heading, not the browser tab title
  title_node = @doc.at_css('h2.title.heading')
  work_params[:title] = if title_node
    title_node.inner_text.strip
  else
    @doc.at_css('title')&.inner_text&.sub(/\s*\[Archive of Our Own\]\s*$/i, '')&.strip.to_s
  end

  # Summary
  summary_node = @doc.at_css('.summary.module blockquote.userstuff')
  work_params[:summary] = clean_storytext(summary_node.inner_html) if summary_node

  # Author beginning notes (inside .preface.group, before chapter content)
  preface = @doc.at_css('.preface.group')
  if preface
    notes_node = preface.at_css('.notes.module blockquote.userstuff')
    work_params[:notes] = clean_storytext(notes_node.inner_html) if notes_node
  end

  # Story text: extract only from #chapters .userstuff, not the whole page body
  chapters_div = @doc.at_css('#chapters')
  if chapters_div
    userstuff = chapters_div.at_css('.userstuff')
    storytext = userstuff ? userstuff.inner_html : chapters_div.inner_html
  else
    storytext = @doc.at_css('body')&.inner_html || _story
  end
  work_params[:chapter_attributes][:content] = clean_storytext(storytext)

  if detect_tags
    meta_group = @doc.at_css('dl.work.meta.group')
    if meta_group
      rating = meta_group.css('dd.rating.tags li a.tag').map { |a| a.inner_text.strip }
      work_params[:rating_string] = convert_rating_string(rating.first) if rating.any?

      warnings = meta_group.css('dd.warning.tags li a.tag').map { |a| a.inner_text.strip }
      work_params[:archive_warning_string] = warnings.join(', ') if warnings.any?

      fandoms = meta_group.css('dd.fandom.tags li a.tag').map { |a| a.inner_text.strip }
      work_params[:fandom_string] = clean_tags(fandoms.join(ArchiveConfig.DELIMITER_FOR_OUTPUT)) if fandoms.any?

      relationships = meta_group.css('dd.relationship.tags li a.tag').map { |a| a.inner_text.strip }
      work_params[:relationship_string] = clean_tags(relationships.join(ArchiveConfig.DELIMITER_FOR_OUTPUT)) if relationships.any?

      characters = meta_group.css('dd.character.tags li a.tag').map { |a| a.inner_text.strip }
      work_params[:character_string] = clean_tags(characters.join(ArchiveConfig.DELIMITER_FOR_OUTPUT)) if characters.any?

      freeforms = meta_group.css('dd.freeform.tags li a.tag').map { |a| a.inner_text.strip }
      work_params[:freeform_string] = clean_tags(freeforms.join(ArchiveConfig.DELIMITER_FOR_OUTPUT)) if freeforms.any?

      published = meta_group.at_css('dd.published')
      work_params[:revised_at] = convert_revised_at(published.inner_text.strip) if published
    end
  end

  post_process_meta(work_params)
end

This is what the parser extracts from an AO3 work page:

| Field | AO3 CSS selector |
| --- | --- |
| Title | `h2.title.heading` |
| Summary | `.summary.module blockquote.userstuff` |
| Author notes | `.preface.group .notes.module blockquote.userstuff` |
| Story text | `#chapters .userstuff` |
| Rating | `dd.rating.tags li a.tag` |
| Archive warnings | `dd.warning.tags li a.tag` |
| Fandom(s) | `dd.fandom.tags li a.tag` |
| Relationship(s) | `dd.relationship.tags li a.tag` |
| Character(s) | `dd.character.tags li a.tag` |
| Additional tags | `dd.freeform.tags li a.tag` |
| Published date | `dd.published` |
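
One detail worth seeing in isolation is the title fallback in the parser above, which only runs when `h2.title.heading` is missing. The sketch below applies the same substitution to a made-up tab title; the "Title - Author - Fandom [Archive of Our Own]" layout of the sample is my assumption about what AO3 puts in `<title>`, and `strip_ao3_suffix` is a hypothetical helper name.

```ruby
# Same substitution the fallback branch of parse_story_from_ao3 applies.
# (Hypothetical helper name, pulled out only for illustration.)
def strip_ao3_suffix(tab_title)
  tab_title.sub(/\s*\[Archive of Our Own\]\s*$/i, '').strip
end

# Hypothetical <title> text, assuming a "Title - Author - Fandom [...]" layout.
sample = "My Story - some_author - Some Fandom [Archive of Our Own]\n"
puts strip_ao3_suffix(sample)
```

The suffix is removed, but the author and fandom stay in the string, which is why the work heading is the preferred source for the title.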

Part 3: Deploy and Restart

I am using Docker to run my project, so this is how I restart the Rails application for these changes to take effect:

docker restart <your-web-container-name>

For standard setups, it would look more like this:

touch tmp/restart.txt
# or
bin/rails restart

How the Parser Dispatch Works (for further customization)

When a URL is submitted for import, story_parser.rb calls get_source_if_known which iterates through KNOWN_STORY_PARSERS and tests each entry's SOURCE_* regex against the URL. The first match wins, and the corresponding parse_story_from_<source> method is called.

This means you can add parsers for any archive, not just AO3. For example, to add a parser for yourdomain.example:

1. Add SOURCE_YOURARCHIVE = 'yourdomain\.example'.freeze
2. Add 'yourarchive' to KNOWN_STORY_PARSERS
3. Write def parse_story_from_yourarchive(_story, detect_tags = true) following the same pattern
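
To see how the three pieces fit together, here is a stripped-down sketch of a first-match-wins dispatch. It is not the upstream implementation: `MiniStoryParser`, `source_for`, and `parse` are hypothetical names, and the parser methods just return a symbol so the routing is visible.

```ruby
class MiniStoryParser
  SOURCE_AO3 = '(archiveofourown\.org|ao3\.org)'.freeze
  SOURCE_YOURARCHIVE = 'yourdomain\.example'.freeze # hypothetical archive
  KNOWN_STORY_PARSERS = %w[ao3 yourarchive].freeze

  # Returns the first source name whose SOURCE_* pattern matches the URL, or nil.
  def source_for(url)
    KNOWN_STORY_PARSERS.find do |source|
      pattern = Regexp.new(self.class.const_get("SOURCE_#{source.upcase}"))
      pattern.match?(url)
    end
  end

  # Routes to parse_story_from_<source>, or to the generic fallback.
  def parse(url, story)
    source = source_for(url)
    source ? send("parse_story_from_#{source}", story) : parse_story_from_unknown(story)
  end

  private

  def parse_story_from_ao3(_story)
    :ao3
  end

  def parse_story_from_yourarchive(_story)
    :yourarchive
  end

  def parse_story_from_unknown(_story)
    :unknown
  end
end

parser = MiniStoryParser.new
puts parser.parse('https://archiveofourown.org/works/12345', '<html></html>')
puts parser.parse('https://yourdomain.example/works/1', '<html></html>')
```

Because the constant name and the method name are both derived from the entry in the list, the three names have to line up exactly or the new parser is never reached.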

Since all OTWA-based forks share the same HTML structure, the same parse_story_from_ao3 method will also work for importing from other OTWA forks (Superlove, Sunset, SquidgeWorld, etc.)! Just add their domains to SOURCE_AO3, or create a separate source constant whose parse_story_from_<source> method reuses the AO3 parser.

For example, to also support imports from Superlove (superlove.sayitditto.net) and Sunset (sunset.femslash.club):

SOURCE_AO3 = '(archiveofourown\.org|ao3\.org|superlove\.sayitditto\.net|sunset\.femslash\.club)'.freeze

This only holds while those archives keep the stock OTWA work-page markup. A fork that changes those templates may need its own selectors.
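
If the list of hosts keeps growing, one option is to build the pattern string from a plain list instead of hand-escaping every dot. This is a sketch of that idea rather than anything upstream does, and `OTWA_HOSTS` is a hypothetical constant name.

```ruby
# Hypothetical list of OTWA-based hosts that should share the AO3 parser.
OTWA_HOSTS = %w[
  archiveofourown.org
  ao3.org
  superlove.sayitditto.net
  sunset.femslash.club
].freeze

# Regexp.escape turns each "." into "\." so it only matches a literal dot.
SOURCE_AO3 = "(#{OTWA_HOSTS.map { |host| Regexp.escape(host) }.join('|')})".freeze

puts SOURCE_AO3
```

The result is the same string as the hand-written constant, so the rest of story_parser.rb does not need to know how it was produced.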

Testing, Limitations, and Conclusion

1. Go to Post > Import Work on your archive
2. Paste an AO3 work URL (e.g. https://archiveofourown.org/works/12345)
3. Select a language and press Import
4. You should reach a preview page showing only the story text, with tags auto-populated from AO3

As of right now, there are limitations with this parser. To start, restricted works (login-required on AO3) cannot be imported. Next, a multi-chapter work imported from its main /works/ID URL will only bring in the first chapter's content. Each chapter needs to be imported separately via its /works/ID/chapters/CHAPTER_ID URL, unless you implement chaptered parsing yourself.
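
Since the two URL shapes behave differently, it can help to tell them apart before importing, for example to warn that only the first chapter will come through. The helper below is a hypothetical sketch based only on the /works/ID and /works/ID/chapters/CHAPTER_ID shapes described above; it is not part of the codebase.

```ruby
# Hypothetical patterns for the two URL shapes, allowing a trailing slash,
# query string, or fragment.
WORK_URL    = %r{/works/(\d+)/?(?:[?#].*)?\z}
CHAPTER_URL = %r{/works/(\d+)/chapters/(\d+)/?(?:[?#].*)?\z}

# Returns :chapter, :work, or :other for a given import URL.
def import_url_kind(url)
  return :chapter if CHAPTER_URL.match?(url)
  return :work if WORK_URL.match?(url)

  :other
end

puts import_url_kind('https://archiveofourown.org/works/12345')
puts import_url_kind('https://archiveofourown.org/works/12345/chapters/67890')
```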

Also, the import does not carry over the original author name, as the work is posted under the importing user's account. So please only import your own work or work you have explicit permission to upload!

Anyways, that's all there is! It's really fun to write out these technical guides even when they only serve a small number of people. If you're from the future and have created a new OTWA-fork project, I hope this guide helps you as well!
