Synchronized File Store

Submitted by scott on March 11, 2006 - 6:47pm

I have a problem that I think is fairly common these days. I work on several machines day to day, and my files are scattered across all of them. I'll be working on one machine and realize the file I want is on a different one. I try to tidy up the mess manually when I can, but it's a losing battle. I end up doing what I think many others do: emailing files to myself to move them from one machine to another (I don't own a flash drive, and this is usually the fastest method). Emailing is a fairly ugly way to transfer files, but sometimes it's the easiest. On my home machines I can copy over the network, but for my office machine it's not as easy.

So.....

I have a server that I do use as a fileserver but I tend not to store all my files there because I don't always have access to it. I need to have offline access to my files.

What I'm thinking of for a solution:

I want to store all my files in my home directory on my server BUT have each laptop keep a working copy of that set of files. I want to be able to synchronize the filesystems so that whatever machine I'm working on will have the latest set of files. It would also be nice to be able to access the file store on the server directly from any computer/platform with little setup, for those times when I'm using a public terminal. I would also like it to be secure, with communications encrypted.

I want to be able to work on the files if I am disconnected from the network and I want files to sync back up with the server automatically when I reconnect to the network.

It's a lot to ask, but this should be feasible...at least in part. I was thinking I might be able to use rsync for the synchronization... I was even considering CVS or Subversion but thought this might be overkill and could be cumbersome... I even thought of using tools that turn Gmail into a filesystem, but I would rather host the files myself and not trust someone else. I am, after all, talking about ALL my digital files.
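To make the rsync idea a bit more concrete, here is a minimal Python sketch of a one-way push of a working copy to the server over SSH. The paths and the host name are made up for illustration, and the function names are hypothetical. Note that plain rsync mirrors in one direction only, so by itself it does not give true two-way synchronization; it would have to be run in each direction with some care.

```python
import subprocess


def build_rsync_command(source, destination, dry_run=True):
    """Build an rsync argument list that copies `source` to `destination` over SSH.

    -a  archive mode (recursive, preserves times and permissions)
    -z  compress data in transit
    -e  run the transfer through ssh so the traffic is encrypted
    """
    cmd = ["rsync", "-az", "-e", "ssh"]
    if dry_run:
        # Only report what would be transferred; change nothing.
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def push(source, destination, dry_run=True):
    """Run the transfer; raises CalledProcessError if rsync fails."""
    return subprocess.run(build_rsync_command(source, destination, dry_run), check=True)


if __name__ == "__main__":
    # Hypothetical local path and server name, for illustration only.
    print(build_rsync_command("/home/scott/files", "scott@server.example:files/"))
```

Starting with the dry run seems wise here, since a mistake in either path could overwrite the only good copy of a file.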

Anyone have any thoughts on a good solution? Or any feedback on how they work around the problem?

I think if I figure something out this could be awesome...

Sync

Give SyncBack a try: http://www.2brightsparks.com/ . It can sync to an ftp site for access from anywhere.

Cross-platform is a deal breaker

Hey Jason!

Good to see you found my little corner of the web.  You are now in my RSS feeds.

SyncBack looks interesting but it's only for Win32!  I run an assortment of machines:  Windows XP, OS X, Linux...

I really need a solution that will work across all my machines.

Thinking on this myself

And the one thought that bubbled to the top was a versioning filesystem (aka WinFS) that could track all changes and then detect when you are back on your "home" network and sync everything up.

It would then have to push out to other registered devices and update them in due course.

I want transparency, not an excuse to use Subversion :)

You are still going to end up with a three-way merge issue, however. That would be a bugger to deal with when your files are binary or otherwise unviewable.

Not a simple problem, but then you are doing your PhD these days and have lots of spare time, right?!?

Still researching

I was thinking along the same lines.  I believe there is an open source project looking at implementing a filesystem utilizing Subversion as the underlying file store...but I believe it's only for Linux, and I'm not sure how stable it is. If it's not a cross-platform tool, it's not really that useful to me.

Yes, the merge issue is a bugger; however, in those situations you will almost always have to take some manual action.  You could attempt an auto-merge, but like you said, that would only be feasible on non-binary files.  I think the merge issue typically won't be too common.  The whole point of this system is to have current versions of files always be on all machines.  The only time a conflict would really occur is if you were doing disconnected edits on a file and at the same time editing that same file on another machine.  Probably not going to happen all that often with document files.  (famous last words)

Why? Well because if I'm editing my resume on my disconnected laptop, I don't start editing it on my laptop and then switch over to my desktop and start over editing it there. I would edit on the same machine until I was done. Then next time I connected my laptop...ZIP! That version should get copied to the server and then propagated to my desktop (this is the way I'm imagining this working). So now when I sit at my desktop I have the same version of my resume as my laptop does...no merge issue if I continue editing my resume on the desktop.

However, this could be a big issue for specialized files such as system files or files that change often outside of my control like a mail db file. These files *could* potentially be changed on both machines and if they were part of the synchronization process they would have major merge issues.
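One way to tell the harmless case (only one side changed) from a real conflict (both sides changed) is to remember what the file looked like at the last successful sync and compare both copies against that. The following is a rough Python sketch of that idea, assuming the sync tool stores a content hash per file after each sync; the function names and the returned labels are invented for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content hash used to tell whether a copy has changed."""
    return hashlib.sha256(data).hexdigest()


def classify(base: str, local: str, server: str) -> str:
    """Compare the local and server hashes against the hash from the last sync.

    base   -- hash recorded at the last successful sync
    local  -- hash of the copy on this machine
    server -- hash of the copy on the file server
    """
    if local == server:
        return "in_sync"   # nothing to do
    if local == base:
        return "pull"      # only the server copy changed
    if server == base:
        return "push"      # only the local copy changed
    return "conflict"      # both changed since the last sync: needs manual action


if __name__ == "__main__":
    old = digest(b"resume v1")
    new = digest(b"resume v2")
    print(classify(old, new, old))  # push
```

The resume example above is the "push" case followed by a "pull" on the desktop. A mail db file edited on both machines would land in "conflict".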

Maybe one way out would be to compare timestamps on files: if the local file's timestamp is older than that of the version on the master file store, the local changes are thrown out; otherwise those changes are applied.  Dangerous, and you'd definitely have to make sure the system clocks were all synchronized...
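The timestamp rule could look something like this in Python. It is only a sketch of the rule as described, with hypothetical function names, and it silently trusts that both clocks agree, which is exactly the dangerous part.

```python
import os


def should_apply_local(local_mtime: float, server_mtime: float) -> bool:
    """Timestamp rule: discard local changes if they are older than the server copy.

    Both arguments are modification times in seconds since the epoch.
    Assumes the clocks on both machines are synchronized.
    """
    return local_mtime >= server_mtime


def should_apply_local_file(local_path: str, server_mtime: float) -> bool:
    """Same rule, reading the local timestamp from the filesystem."""
    return should_apply_local(os.path.getmtime(local_path), server_mtime)


if __name__ == "__main__":
    # Local copy edited 60 seconds after the server copy: keep the local changes.
    print(should_apply_local(1_000_060.0, 1_000_000.0))
```

Unlike comparing against the last synced version, this rule cannot tell that both sides changed, so the older of two conflicting edits is simply lost.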

Doing a PhD - yes, lots of spare time - um no.  :)

Line Breaks...

Hmm line breaks seem to be stripped out of my comments for some reason...