Launching archiveJ. (open beta)

Handling long-running work around records and photos

When editing a white paper, the most time-consuming work is reading through large volumes of material by hand, revising it repeatedly, and settling the direction in conversations with senior decision makers. Building a digital archive is similar: the most delicate work is organizing the materials and generating many kinds of content from them. archiveJ, which we built, attaches a nearly free LLM agent directly to the folder and has it carry out this work.

If we use an LLM, won’t the token cost get expensive?

Yes. If you hand an LLM broad edits without understanding the code or the processing flow, and without optimizing anything, it burns tokens on unnecessary work and on tasks you never asked for. But the materials in the digital archives and white papers we manage have a fixed scope, and the work is something people could already do by hand. By attaching workers, or agents, to work that only funded organizations and larger institutions could previously take on, we have opened a path that lets individual researchers keep moving forward.

1) Agent work for organizing records

(Oral history) Leave interview transcripts in place and let the agent infer their context.
(Photos) Add descriptions to photos and work out their context.
(PDF, Hangul, txt) Convert documents into a readable form so the agent can read previously created projects and documents.
(Video) Understand the video's content, including its audio and any text that appears in it.
(Video without audio) When the video is not transcribed, understand the visual content and describe it as footage.
(Link, website) Capture the site and read what it is about.
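The per-format handling above can be pictured as a simple routing step that runs before any agent work starts. The following is a minimal sketch, not archiveJ's actual implementation: the extension list, the task names, and the `task_for` and `plan_tasks` helpers are all assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import PurePath

# Hypothetical mapping from file extension to agent task.
# Neither the extensions nor the task names come from archiveJ itself;
# ".hwp" is assumed here to stand for Hangul documents.
TASK_BY_EXTENSION = {
    ".jpg": "describe_photo",
    ".jpeg": "describe_photo",
    ".png": "describe_photo",
    ".pdf": "convert_document",
    ".hwp": "convert_document",
    ".txt": "convert_document",
    ".mp4": "describe_video",
    ".mov": "describe_video",
    ".url": "capture_website",
}


def task_for(filename: str) -> str:
    """Return the task name for one file, or 'needs_review' if unknown."""
    suffix = PurePath(filename).suffix.lower()
    return TASK_BY_EXTENSION.get(suffix, "needs_review")


def plan_tasks(filenames: list[str]) -> dict[str, list[str]]:
    """Group file names by the task that would handle them."""
    plan: dict[str, list[str]] = defaultdict(list)
    for name in filenames:
        plan[task_for(name)].append(name)
    return dict(plan)


if __name__ == "__main__":
    # Files with an unrecognized or missing extension are set aside for a person.
    print(plan_tasks(["interview_01.txt", "opening.JPG", "ceremony.mp4", "notes"]))
```

Anything the mapping does not recognize falls into `needs_review` instead of being guessed at, which matches the idea that people still set the guides and missions for the agent.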

2) Agent work that creates new content from records

(General users) Provide search and browsing so that people can see, year by year, what the organization has done through the digital archive.
(Researchers) Publish a research guide so researchers can read deeply into the relevant materials.
(Researchers) Cross-check related materials and provide base material that can be used in papers.
(Content) Publish blog content based on generated materials.
(Content) Publish exhibitions with a theme and intent by combining multiple media in the direction the organization wants.
(Content) Publish multiple collections based on published keywords.

3) Translation work and multilingualization

(Translation) With this system, two to three hours of work can produce about 1,000 pages of translated material. This is more than sentence-by-sentence translation: it covers each item's basic information, collected keywords, related pages in each language, and all generated content.
(Translation and website) If 10,000 items are archived, the system can serve all 10,000 items in each language.
(Translation) By combining Google Translate with direct translation, we can publish the archive's content in multiple languages more easily.
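To make the scale of these figures concrete, here is a small back-of-the-envelope sketch. The item count, page count, and hours come from the text above; the number of languages is a made-up value for illustration, and the helper names are hypothetical.

```python
def localized_pages(items: int, languages: int) -> int:
    """Pages the site serves if every item is published in every language."""
    return items * languages


def pages_per_hour(pages: int, hours: float) -> float:
    """Average translation throughput over a working session."""
    return pages / hours


if __name__ == "__main__":
    # 10,000 archived items in an assumed 3 languages.
    print(localized_pages(10_000, 3))  # 30000
    # About 1,000 pages in two to three hours, as stated above.
    print(pages_per_hour(1_000, 3), pages_per_hour(1_000, 2))
```

Since every item is served in every language, the number of localized pages grows with the product of the two, which is why translation is handled by the system instead of page by page.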

4) Current work and limitations

(Improvement) The archive owners and the development team discuss and organize the instructions, missions, and context of the materials together, and the AI agent refers to them.
(Limitations) AI agents do not do everything perfectly. We need to keep investing time and effort so that humans can set accurate guides and missions and help the agents perform well.
(Expansion) As archiveJ grows, researchers and clients will want to create more archives. But refining the content and the guidance will take more time.

June 13, 2026 TCG
