Abolish the saving of documents

The status quo

The modern computing paradigm:

  • Documents are created and saved, then opened, modified, and saved again to the same filename.
  • While opened, a document can be modified. Many programs have an undo button to revert to one or more previous states. New modifications can then be made, and the document saved and closed. (When you close the application, these undo revisions are lost.)
  • In programs like Microsoft Office, the changes you make to a document are incrementally saved as temporary files, so that if the application crashes, you can usually retrieve most of your work since you last saved it. (When you intentionally close the application, these temporary files are lost.)
  • Some programs (wikis, Google Docs, Adobe Acrobat, Microsoft Word, OpenOffice Word Processor) greatly smooth collaboration by allowing changes to a document to be tracked under the name of the person who made those changes, so others can see what was changed and the differences between each contributor’s revision.
  • When you’re working on a document, and take a short break to start working on something else, you minimize or otherwise hide the application that the document is open in.
  • When you’re working on a document, and take a long break to work on something else or turn off the computer, you save the document and close the application that the document is open in. You then re-open the application and document manually when you want to work on it again.

But wait.  Think about it…

Aren’t these all really just variations on the same theme?

Continuous archival

In the future, I imagine that the whole concept of saving documents by hand will be abolished, and a new concept that takes all of these related ideas into account will evolve. The sooner we can push for this, the better.

Instead of manually saving files, closing applications, and re-opening them to continue working on a document, the state of the program and document will be transparently, continuously saved. Every time you make a change to the document or its settings, that change will be recorded, allowing you to access all previous revisions of the document. “Revision control”, “undo/redo”, “track changes”, “recovery files”, and “autosave” will all be combined into the same thing.

Instead of pressing Save, you might press something like Mark, Milepost, or Commit; something to indicate that this is a version you consider stable and that you might want to return to. (This is kind of like the difference between previewing and saving in a wiki.) Minor revisions might still be saved, too, but when you want to go back to a previous revision, you’d be able to focus on these major “mile markers”. (You could also go back and remove or add these “milepost” tags to specific revisions after the fact.)

Conscious maintenance of the files and program state will no longer be necessary. These are really just system-level details derived from historical needs that the user should not have to worry about. All the user should be concerned about is the state of their work.

Revision trees

One problem with traditional undo commands is that you can’t undo an undo. Say you’re using a graphics program, and you carefully draw a red star, undo it, and then draw a blue circle instead. Then you change your mind and want the red star back. Too bad. When you undid it and made a change to the document, the previous redo action was lost forever. You’ll have to draw the red star from scratch again, since your revisions can’t branch.

Ideally, you could still press Undo or Revert, but the application wouldn’t just be saving the previous edits in memory and going back a step; each change would be saved permanently. You could “close” an application and re-open it, and still undo that color correction that didn’t look quite right and redo it from the original image data. But the change that you reverted would be saved, too. You’d be creating a fork in the revision history, a branch in the revision tree, and if you undo something and make another modification, you can still go to the thing you undid, too. You could change your mind and keep both the blue circle and the red star.

If you want, you can fork a particular revision to an entirely new name. You might want to make a February newsletter based on January’s, for instance, and keep January’s as a separate file.

Collaboration

Edits that you make to the file would have a personal identifier attached to them (hostname/username? personal URN?), and you could optionally share parts of the revision history with other people, so that when they modify the document, you can view the differences between each version and see who made each edit.

This would also be useful when a script, package manager, or other system agent makes a change to a system file, as the change would be logged under the script’s name, date, and time, just like a human collaborator. When I (or something else) changes a system configuration file and it hoses my system, I will be able to revert to the last version or see what change caused the problem, without requiring me to make intentional backups manually or rename files.

This is good for sharing files because it allows collaboration. This is not so good for sharing files because:

  • All those revision diffs take up space and bandwidth
  • Revisions might contain things you don’t want other people to see

So I imagine that the revisions would be tagged “private” by default (more on tag-based filesystems later), and when you publish a document or send a file to someone, you would have to consciously choose to send the revision information as well. The default would be to send only the most recent stable revision (another place where the Commit button comes in):

You’re sending a file, but you’ve made further changes to this document since you last pressed Commit. Are you sure you want to send this version, or do you want to include the latest changes, too?

Setting things aside

I mentioned being able to undo a change to a document even after you’ve closed the application it is in. But then, why would we even need the concept of “closing” an application? If we’ve been successful in removing the distinction between “save/close/open” and “undo/redo”, then “close” and “minimize” are just variations on the same action: “put aside for now”. The only difference between these actions is the way the processor and memory handles them. So let’s get rid of this system-level distinction, too.

When you close an application or minimize it, what are you really doing? You’re just putting it out of sight while you work on something else.

I imagine that the system will become much more document-centric. I imagine that documents, disparate in format but related in concept, could be grouped into “projects” with tags, and you’d minimize/close the entire project to work on a different project, and then return to the project later. Instead of “minimizing” something, you’d be filing it away for working on at an indefinite later time. Instead of a taskbar that shows currently-open applications, you’d have a “project bar” that shows most recently worked-on projects. As you open something new, the oldest will “fall off” the end. (More on project-based workflow later.)

It already seems like a program that hasn’t been touched in a while gets “filed away” on the hard drive. If I come back to an application after hours spent doing something else, it’s apparent that the state of the program has to be loaded from the hard drive instead of memory, which was being used for other things. It “feels” as if the program’s progress is being reloaded from more permanent storage. So all this memory management and such could be handled automatically, so that there’s effectively no difference between minimizing an application for a few hours and closing it for a few hours.

Implementation

I’m not thinking too hard about how this grand scheme would be physically implemented; just thinking about what I would like to see from the user standpoint.

In the time between now and this future, we’ll have to deal with things like system files, which can’t really be automatically saved in an intermediate state. You really want to make the changes, triple check that there are no errors, and then save. (This is the major reason I’m afraid to use Scribes, for instance.) Automatic saving to the “original file location” doesn’t work here, though revision control would still be useful. This is another place where the Milestone or Commit button would come in.

Maybe the revision diffs would be stored in a separate physical file anyway. (Maybe the file system could even make this system compatible with older programs that don’t support it directly, by creating virtual folders for each file with the previous revisions represented as datestamped files, and the filesystem would assemble the older revision from the diffs on the fly when asked to load them.)

On the practical side, (bigger?) files will be saved using delta compression, so that only the current reversion and differences between previous revisions actually take up space on the drive. But hard drives are cheap, and will just get insanely cheaper over time. (Today, the average American can buy more than 50 billion characters of storage for an hour of labor.) So the space incurred by something like this for plain text, formatted text, HTML, and similar documents is completely negligible.

I’m sure there are algorithms for generating diffs of non-text files, too. Rsync can do this sort of thing, though it’s optimized for transmission over the network. Something optimized for dealing with disk sectors would probably be more efficient here. Really big binary files are rarely changed by the user, anyway. People doing full-length video editing have other ways of dealing with this stuff. I’m thinking more along the lines of text documents, spreadsheets, photos, drawings, mp3 tags, system configuration files, code, and so on. Of course the exact details would have to be optimized by someone who actually knows what they’re doing. 🙂

I’m sure all of this has been proposed in some form or other. It’s inspired by wikis, which have the revision control and multiple editors feature. Some allow reverting to previous versions, which allows undo/redo forking, though this isn’t clearly shown as a revision tree. Google Documents implements a lot of this now. I’m not really a programmer, and I’ve never been directly involved with CVS or subversion or anything like that, but I get the impression that these systems do a lot of this, including the revision trees.

But now let’s apply this to our own computers, in an intuitive usable way. All of these different concepts should be combined into one and applied to the documents and applications on my computer’s native interface that I use every day. The file paradigm on personal computers hasn’t really changed in decades.