Abolish the saving of documents

The status quo

The modern computing paradigm:

  • Documents are created and saved, then opened, modified, and saved again to the same filename.
  • While opened, a document can be modified. Many programs have an undo button to revert to one or more previous states. New modifications can then be made, and the document saved and closed. (When you close the application, these undo revisions are lost.)
  • In programs like Microsoft Office, the changes you make to a document are incrementally saved as temporary files, so that if the application crashes, you can usually retrieve most of your work since you last saved it. (When you intentionally close the application, these temporary files are lost.)
  • Some programs (wikis, Google Docs, Adobe Acrobat, Microsoft Word, OpenOffice Word Processor) greatly smooth collaboration by allowing changes to a document to be tracked under the name of the person who made those changes, so others can see what was changed and the differences between each contributor’s revision.
  • When you’re working on a document, and take a short break to start working on something else, you minimize or otherwise hide the application that the document is open in.
  • When you’re working on a document, and take a long break to work on something else or turn off the computer, you save the document and close the application that the document is open in. You then re-open the application and document manually when you want to work on it again.

But wait.  Think about it…

Aren’t these all really just variations on the same theme?

Continuous archival

In the future, I imagine that the whole concept of saving documents by hand will be abolished, and a new concept that takes all of these related ideas into account will evolve. The sooner we can push for this, the better.

Instead of manually saving files, closing applications, and re-opening them to continue working on a document, the state of the program and document will be transparently, continuously saved. Every time you make a change to the document or its settings, that change will be recorded, allowing you to access all previous revisions of the document. “Revision control”, “undo/redo”, “track changes”, “recovery files”, and “autosave” will all be combined into the same thing.

Instead of pressing Save, you might press something like Mark, Milepost, or Commit; something to indicate that this is a version you consider stable and that you might want to return to. (This is kind of like the difference between previewing and saving in a wiki.) Minor revisions might still be saved, too, but when you want to go back to a previous revision, you’d be able to focus on these major “mile markers”. (You could also go back and remove or add these “milepost” tags to specific revisions after the fact.)

Conscious maintenance of the files and program state will no longer be necessary. These are really just system-level details derived from historical needs that the user should not have to worry about. All the user should be concerned about is the state of their work.

Revision trees

One problem with traditional undo commands is that you can’t undo an undo. Say you’re using a graphics program, and you carefully draw a red star, undo it, and then draw a blue circle instead. Then you change your mind and want the red star back. Too bad. When you undid it and made a change to the document, the previous redo action was lost forever. You’ll have to draw the red star from scratch again, since your revisions can’t branch.

Ideally, you could still press Undo or Revert, but the application wouldn’t just be saving the previous edits in memory and going back a step; each change would be saved permanently. You could “close” an application and re-open it, and still undo that color correction that didn’t look quite right and redo it from the original image data. But the change that you reverted would be saved, too. You’d be creating a fork in the revision history, a branch in the revision tree, and if you undo something and make another modification, you can still go to the thing you undid, too. You could change your mind and keep both the blue circle and the red star.

If you want, you can fork a particular revision to an entirely new name. You might want to make a February newsletter based on January’s, for instance, and keep January’s as a separate file.


Edits that you make to the file would have a personal identifier attached to them (hostname/username? personal URN?), and you could optionally share parts of the revision history with other people, so that when they modify the document, you can view the differences between each version and see who made each edit.

This would also be useful when a script, package manager, or other system agent makes a change to a system file, as the change would be logged under the script’s name, date, and time, just like a human collaborator. When I (or something else) changes a system configuration file and it hoses my system, I will be able to revert to the last version or see what change caused the problem, without requiring me to make intentional backups manually or rename files.

This is good for sharing files because it allows collaboration. This is not so good for sharing files because:

  • All those revision diffs take up space and bandwidth
  • Revisions might contain things you don’t want other people to see

So I imagine that the revisions would be tagged “private” by default (more on tag-based filesystems later), and when you publish a document or send a file to someone, you would have to consciously choose to send the revision information as well. The default would be to send only the most recent stable revision (another place where the Commit button comes in):

You’re sending a file, but you’ve made further changes to this document since you last pressed Commit. Are you sure you want to send this version, or do you want to include the latest changes, too?

Setting things aside

I mentioned being able to undo a change to a document even after you’ve closed the application it is in. But then, why would we even need the concept of “closing” an application? If we’ve been successful in removing the distinction between “save/close/open” and “undo/redo”, then “close” and “minimize” are just variations on the same action: “put aside for now”. The only difference between these actions is the way the processor and memory handles them. So let’s get rid of this system-level distinction, too.

When you close an application or minimize it, what are you really doing? You’re just putting it out of sight while you work on something else.

I imagine that the system will become much more document-centric. I imagine that documents, disparate in format but related in concept, could be grouped into “projects” with tags, and you’d minimize/close the entire project to work on a different project, and then return to the project later. Instead of “minimizing” something, you’d be filing it away for working on at an indefinite later time. Instead of a taskbar that shows currently-open applications, you’d have a “project bar” that shows most recently worked-on projects. As you open something new, the oldest will “fall off” the end. (More on project-based workflow later.)

It already seems like a program that hasn’t been touched in a while gets “filed away” on the hard drive. If I come back to an application after hours spent doing something else, it’s apparent that the state of the program has to be loaded from the hard drive instead of memory, which was being used for other things. It “feels” as if the program’s progress is being reloaded from more permanent storage. So all this memory management and such could be handled automatically, so that there’s effectively no difference between minimizing an application for a few hours and closing it for a few hours.


I’m not thinking too hard about how this grand scheme would be physically implemented; just thinking about what I would like to see from the user standpoint.

In the time between now and this future, we’ll have to deal with things like system files, which can’t really be automatically saved in an intermediate state. You really want to make the changes, triple check that there are no errors, and then save. (This is the major reason I’m afraid to use Scribes, for instance.) Automatic saving to the “original file location” doesn’t work here, though revision control would still be useful. This is another place where the Milestone or Commit button would come in.

Maybe the revision diffs would be stored in a separate physical file anyway. (Maybe the file system could even make this system compatible with older programs that don’t support it directly, by creating virtual folders for each file with the previous revisions represented as datestamped files, and the filesystem would assemble the older revision from the diffs on the fly when asked to load them.)

On the practical side, (bigger?) files will be saved using delta compression, so that only the current reversion and differences between previous revisions actually take up space on the drive. But hard drives are cheap, and will just get insanely cheaper over time. (Today, the average American can buy more than 50 billion characters of storage for an hour of labor.) So the space incurred by something like this for plain text, formatted text, HTML, and similar documents is completely negligible.

I’m sure there are algorithms for generating diffs of non-text files, too. Rsync can do this sort of thing, though it’s optimized for transmission over the network. Something optimized for dealing with disk sectors would probably be more efficient here. Really big binary files are rarely changed by the user, anyway. People doing full-length video editing have other ways of dealing with this stuff. I’m thinking more along the lines of text documents, spreadsheets, photos, drawings, mp3 tags, system configuration files, code, and so on. Of course the exact details would have to be optimized by someone who actually knows what they’re doing. 🙂

I’m sure all of this has been proposed in some form or other. It’s inspired by wikis, which have the revision control and multiple editors feature. Some allow reverting to previous versions, which allows undo/redo forking, though this isn’t clearly shown as a revision tree. Google Documents implements a lot of this now. I’m not really a programmer, and I’ve never been directly involved with CVS or subversion or anything like that, but I get the impression that these systems do a lot of this, including the revision trees.

But now let’s apply this to our own computers, in an intuitive usable way. All of these different concepts should be combined into one and applied to the documents and applications on my computer’s native interface that I use every day. The file paradigm on personal computers hasn’t really changed in decades.

The desktop metaphor is stupid

My girlfriend is showing me images on her computer (she uses an Apple because it’s “simple”). After we view all of the files, which are scattered across her computer’s desktop in a random pattern, I watch as she grabs one with the mouse and drags it onto the corner of the desktop. She then drags one file after another onto the same spot, making a pile. Later, she’ll move them one by one out of that pile in order to look at them again.

Pretty much every window manager uses the desktop metaphor of overlapping documents in a messy pile. The desktop metaphor was a dumb idea when it was introduced, and it’s still a dumb idea now. Computers allow us to interact with documents in much richer, more powerful ways than pretending that they’re a mess of overlapping papers on our desk.

Why do we allow icons and windows to overlap? Why would anything ever overlap? It’s a useless functionality. When one window is overlapped partially by another, it’s usually the case that enough content is hidden by the overlap that it makes the visible content useless and just wasting screen space; you can read half a sentence, but the other half is hidden behind the focused window. You have to switch back to the partially-covered window anyway in order to continue using it, so why not just keep it entirely hidden?

We know that this is suboptimal, yet we continue to use it. The computer interface should maximize the usefulness of the information it presents to us. Windows should try to take up as much screen space as they can without infringing on other windows. Don’t overlap the windows, but try not to leave any wasted space, either.

Is this a blog or what?

I have more than 100 entries on Google Docs and hundreds of text files, self-emails, and pieces of scrap paper floating around my life, many of which contain ideas, thoughts, or potential projects that I’d like others to see. So I’m trying hard to get some in good enough shape that I can post them here.

Several people have sent me emails about other parts of my site, which I should migrate to here so they can post such things publicly, and I can respond here as well and everyone can bounce ideas off each other.

WordPress puts everything in chronological order like a stupid blog, which is not what I had in mind for this site. I’d rather organize the “articles” by hierarchical tags/categories and update them over time, reply to visitors’ comments, and so on. I installed the Audit Trail plugin for WordPress so my changes to posts are internally archived to prevent loss of information, and I’m installing other plugins to try to get the desired effect. I guess I should just install a wiki or something.

But until then, I really should post things that I can direct people to. Somewhere, anywhere.

Keeping windows on-screen

Another problem with the desktop metaphor: In every window manager I’ve ever used, if you move a window out of the way, there’s a tendency to move part of it off the screen. This is fine sometimes, but if you are moving it out of the way so you can still look at it, while also looking at another window (and wouldn’t you just minimize it otherwise?), this has all the same problems as overlapping windows.

For instance, if you take a window in any normal window manager:

|                         |
| .==============.        |
| |             ^|        |
| |             ||        |
| |             ||        |
| |             v|        |
| '--------------'        |
|                         |

and drag it off the screen, a lot is now hidden from view, including the scroll bars:

|                         |
|                         |
|                         |
|                         |
|                .========|:::::.
|                |        |    ^:
|                |        |    ::
|                |        |    ::
'-------------------------'    v:

Everything off-screen is now unreadable, and you can’t even scroll to access it.

It would be great if, when you move a window off the screen, the window would “smoosh” up against the edges of the screen, dynamically shrinking or increasing in size as you move towards or away from the edge, to keep everything visible.

In this case the bottom and right borders stop at the edge of the screen and the window resizes dynamically:

|                         |
|                         |
|                         |
|                         |
|                .=======.|
|                |      ^||
|                |      v||
|                '-------'|

Then all the content in the moved window is still accessible by text wrapping and/or scrolling.

Normally, when you maximize a window to fill the whole screen, it “remembers” the previous size and position, allowing you to restore it to its original size.

Similarly, when a window is “smooshed” like this against the edges of the screen, it will remember how big the window originally was. When you drag the window back towards the center of the screen with the title bar, it will “re-inflate” to its original size.

|                         |
|      .==============.   |
|      |             ^|   |
|      |             ||   |
|      |             ||   |
|      |             v|   |
|      '--------------'   |
|                         |

But this means that when the window is smooshed, the borders at the edges of the screen aren’t the same as normal borders. They should then be shown in a different style from normal borders, to differentiate them and remind the user that the window still remembers its previous state. If you resize the window inwards using these borders, it will forget the previous state and remain at the current size forever. The borders of the window will then revert to normal functionality. You could also force the window to forget its previous size just by clicking them, perhaps.

I’d say that all windows should do this by default — when would you ever want part of the window to be non-visible? — but there might be important reasons not to for some types of windows, like a program that really loads down the processor during window resize, or a window that can’t be shrunk below a certain size. Maybe the window manager could just add scroll bars in situations like this, when the window has been squished to its smallest size, which would still enable you to scroll around the window (including menu bars or whatever), though without the benefits of word wrapping, etc.

Joining window borders

When windows are tiled, each window continues to have the usual resizable borders; if you shrink one, the other stays untouched. But if edges and corners of windows can be snapped to each other, why can’t two such snapped edges be moved at once? The border that is shared between two windows should behave more like the “split” view you get in applications that let you view two parts of a single document at once.

Say you select two windows and you move them so they tile the screen horizontally; one is on the left half of the screen and one is on the right half:

|              |              |
|              |              |
|              |              |
|              |              |
|              |              |

Normally, the border in the middle, where the two windows meet (and have snapped to each other), consists of two independent resize handles that happen to be touching each other. If you want to make one window smaller and the other larger, it takes two operations. What should happen is that the borders should “fuse”, forming a split handle between them.

The individual window resize borders would still be present, with differentiation by the cursor changing when you hover over each, as is normally done to find the resize handles. So as you move the cursor from left to right, it starts as a regular arrow, changes to a resize cursor for the left window, then a split cursor for both windows, then a resize cursor for the right window, and then back to a regular arrow inside the right window.

Closeup of joint border between two windows (a few pixels wide):

Inside | Resize     | Resize  | Resize    | Inside
left   | right-hand | both    | left-hand | right
window | border of  | windows | border of | window
       | L window   | at once | R window  |

If you move the split line to the left, the left window shrinks while the right window grows, so they are always using up the entire screen.

|         |                   |
|         |                   |
|         |                   |
|         |                   |
|         |                   |

This could also work for corners, letting you drag both windows’ corners at once:

         | |
         | |
         | |

In fact, such a split line could form any time two windows have adjacent (snapped) edges, even if they’re not filling up the entire screen. This:

.---------.                   |
|         |                   |
|         |                   |
|         |                   |
|         '-------------------'

could be changed to this:

.--------------.              |
|              |              |
|              |              |
|              |              |
|              '--------------'

with one click and drag.

Resizing windows

If you double-click on a window resizer handle (either on the sides or the corners), it should do something helpful like “maximizing” in only one direction; expanding the edge or corner all the way to the edge of the screen, in the same way that double-clicking the title bar maximizes all four borders to the edge of the screen. (Maybe it or a similar function could also be used to maximize to the nearest window border without overlapping?) KDE has some intermediate options:

  • Left-click the Maximize button
    • The window maximizes to the entire screen
  • Middle-click the Maximize button
    • The window maximizes in the vertical direction, but stays the same width
  • Right-click the Maximize button
    • The window maximizes in the horizontal direction, but stays the same width

Although this functionality is useful, I think this is pretty unintuitive. Which button does which? Do people use this often enough that it should use up the default functions of all three mouse buttons?

Also, why not have windows snap to a coarse grid instead of just to the screen edges and each other? Is there any reason that windows have to be resizable down to the single pixel? This is very rarely necessary.

I’ve been using allSnap in Windows, which allows snapping to other window borders and coarse things like grids and half-way marks. It makes resizing and moving windows much quicker. You can always press Alt or Shift while dragging to override the snapping for the few instances you need it, just like you currently do in GNOME.