Dropbox: How to REALLY not run a public beta


Man, am I having a bad week. No sooner had I been burned by beta testing Mendeley than I got absolutely toasted trying out Dropbox. The goal of Dropbox, in case you haven’t heard of it, is to let you keep a folder of files transparently synchronized across multiple computers (and the web). In theory, all your computers will have the same set of files, transparently maintained by the Dropbox daemon (background program) running locally. Awesome, right? No more trekking around with an external drive, no more juggling multiple versions of files when you need to work on something on two or more platforms. It also handles folder sharing among users.

Fantastic. Awesome. Terrible. Sync is one of those killer applications that usually ends up killing the user, like a handgun being passed around at a frat party. The result is often a solution worse than the problem, with data corruption and inconsistent data between locations as common failure modes. People have gotten it right recently, however. Apple has done a very good job with MobileMe, at least in terms of sync reliability. I have my complaints about them, but they’ve never messed up my data, even after more than a year and probably over 1000 synchronizations.

I was, therefore, perhaps a bit too unwary in trusting my data to Dropbox. I also figured that if they are already charging people they must have the bugs worked out, right? This is people’s data we’re talking about. No company is going to take control of your data with a product that is still buggy, right? Right?

Wrongo. It took me only two weeks of using Dropbox to find out I was grossly mistaken. Yesterday I moved a few large folders around on my Linux machine, and the result was hopeless corruption of my Dropbox file system, with the server basically throwing its hands up and locking itself in the bathroom (folder “rejected by server”). Note that my problem wasn’t caused by conflicting edits made simultaneously to the same data on different devices (the typical difficulty with synchronization). Dropbox failed spectacularly just because I made multiple changes to ONE of my local copies and it got hopelessly confused. (Fortunately, I was able to restore everything from a backup on another computer, so you can stop sending cards and letters. I appreciate the sympathy, however.)

Talking with tech support, and looking at the forums, this is clearly a known issue that many users are having. A known issue that results in database corruption if you have the audacity to do something insane like move folders around! And they don’t mention this in the FAQ, let alone bright red flashing letters on their web page. Did I mention they are accepting legal tender for this product?

It seems to me their fundamental sync architecture is flawed (it apparently doesn’t record file operations in a way that is guaranteed to preserve the transformation of the file system from one state to another). I wonder if they don’t warn against this in their FAQ because they don’t want their VC funders (who are surprisingly big names) to know they are in over their heads, or if they are so far in over their heads they don’t know they have a problem. To do file sync, as far as I can tell, you basically have to be able to hook into all possible I/O operations on the disk and make sure you record every single change, in order, so that those operations can be “replayed” on the remote copy. I can’t think of another way to guarantee consistency. Maybe the folks at Dropbox found a way to avoid this complication. Maybe they were wrong. I’m not saying I’d be able to do better, and I know it’s a notoriously hard problem, but I’d hope that if I couldn’t solve it I’d at least know that I hadn’t. And I certainly hope I wouldn’t look for funding and customers before I’d solved it.
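To make the “replay” idea concrete, here is a minimal sketch in Python. It illustrates only the ordered-log approach I described above; I have no idea how Dropbox actually works internally. The operation names (`mkdir`, `write`, `move`) and the helper functions are made up for this example, and two temporary directories stand in for the local and remote copies.

```python
import os
import shutil
import tempfile


def apply_op(root, op):
    """Apply one recorded operation to the tree rooted at `root`."""
    kind = op[0]
    if kind == "mkdir":
        os.makedirs(os.path.join(root, op[1]))
    elif kind == "write":
        with open(os.path.join(root, op[1]), "w") as f:
            f.write(op[2])
    elif kind == "move":
        shutil.move(os.path.join(root, op[1]), os.path.join(root, op[2]))
    else:
        raise ValueError(f"unknown operation: {kind}")


def replay(root, log):
    """Replay a log of operations, strictly in the order given."""
    for op in log:
        apply_op(root, op)


def snapshot(root):
    """Return {relative_path: contents} for every file under `root`."""
    files = {}
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full) as f:
                files[rel] = f.read()
    return files


# Hypothetical log: create a folder, move it, then edit a file inside it.
LOG = [
    ("mkdir", "projects"),
    ("write", "projects/notes.txt", "v1"),
    ("mkdir", "archive"),
    ("move", "projects", "archive/projects"),
    ("write", "archive/projects/notes.txt", "v2"),
]


def run_demo():
    """Return (in_order_matches, out_of_order_failed)."""
    with tempfile.TemporaryDirectory() as local, \
            tempfile.TemporaryDirectory() as remote:
        replay(local, LOG)
        replay(remote, LOG)
        in_order_matches = snapshot(local) == snapshot(remote)

    with tempfile.TemporaryDirectory() as scrambled:
        try:
            # The first replayed op writes into a folder that does not exist yet.
            replay(scrambled, list(reversed(LOG)))
            out_of_order_failed = False
        except OSError:
            out_of_order_failed = True

    return in_order_matches, out_of_order_failed


if __name__ == "__main__":
    print(run_demo())
```

Replaying the log in order leaves both copies identical. Replaying the same operations out of order fails as soon as one of them refers to a folder that has not been created or moved yet, which is roughly the kind of trouble I would expect from a sync scheme that loses track of ordering when folders get moved.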

Looking at the Dropbox staff list, I should’ve been more careful. Its CEO and CTO seem like great guys, but they also look like they just started shaving last week. Their CTO, and well over half their staff, are very young, very recent MIT dropouts. With all the new humanities course requirements, I guess you can’t trust MIT undergrads with your data until they’ve gotten at least an MS. Either that, or MIT must cover some pretty important material senior year in Course 6. The fact that the first several iterations of their Mac OS X client didn’t even synchronize all possible parts of a file (despite not informing the user of this) should’ve been a red flag that Dropbox was not being run with a whole lot of discipline or adult supervision.

Am I just writing this to complain? Of course not. I would never do that! I’m writing this with the hope that my experience may prevent at least one other person from wasting their time with Dropbox, or losing their data. I’m also writing about this because my experience with Dropbox, as well as Mendeley, brings up interesting questions about VC technical vetting, a topic which I will discuss in my next post.


23 responses to “Dropbox: How to REALLY not run a public beta”

  1. It’s quite flattering to Mendeley to be mentioned in the same sentence as Dropbox, which is kinda the poster child for “it just works” in terms of cloud storage.

  2. Hi Jon,

    Sorry you had a bad experience with Dropbox. I’ve used it but only sparingly, when I’ve needed to share a file that couldn’t be attached to an email. I’ve certainly not pushed the envelope with it.

    Alain

    • It’s great for sharing a folder or two, but I continue to have problems with it managing a huge file system with tens of thousands of files. I would stay the hell away from it when it comes to important work files that you need every day. I’m increasingly convinced they really don’t know what they are doing. It works the vast majority of the time, but even having a small probability of messing up your customers’ files should be a killer for a company like this. They are doing fine with their legions of beta testers, who are practically sycophantic in their guileless acceptance of any problems they have with it. (Somebody on their forums actually apologized for filing a bug report, informing them that he had to re-download his entire 100 GB of files but that he wasn’t complaining or anything.)

  3. My two cents. I have been using Dropbox for more than a year now, the same account on 4 computers, 1 Windows XP and 3 Ubuntu, and it has been working flawlessly. And I do have a complex file structure, with several thousand files that are modified dynamically as some of my applications do their job (I am a physicist and write my own code to do my data analysis).

    • Thanks very much for writing, Matias. It worked great for me for a long time, and I’m not surprised it’s flawless when updating existing files; there are no race conditions when doing updates to any number of files as the updates can be done in any order. Start moving things around in your file hierarchy, however, and Dropbox can have problems.

      Check out the Dropbox forums, and you’ll see lots of people having problems, especially with the Linux client.

  4. We are a small nonprofit using Dropbox for online offsite backup/archiving and remote access with a Mac server (Snow Leopard), and we recently noticed “file failed to synchronize” error messages for 191 files. Trying to track this down has brought me to a host of possible reasons. It seems like the “cloud computing” applications are trying to get too much done with only a simple set of meta tags, and in the zeal to make simple applications for everyone and everything to use, it becomes conflicted. I’m going to go back to basics, and make offsite, online backups that are not available for remote access, and treat them like the secure repository they should be. Dropbox.com clearly is not a solution.

  5. I’ll echo some of the complaints here. I started using dropbox about 6 months ago and the first round of synchronizations between my 3 PCs went fine……and then got sporadic. The 2 remote PCs still reflect my original file structure: Dropbox has failed to capture and synchronize the changes and additions I’ve made. I’ve puzzled through this on various forums but not being a techie I’m giving up! It’s really terrible when a product is touted as simple to use and reliable and seems to have great promise and then fails utterly to deliver.

    • I think the folks at Dropbox are a bit self-deluded about their product. Every week they come out with a new beta update with a few “minor” fixes and insinuate it’s almost ready, and that things are very reliable. And yet every week or so they find just “one more” bug that can cause something bad to happen to your data. It’s always just one “rare” issue after another with them.

      My theory about their current user base is that the vast majority of them are not serious users and have more enthusiasm than expertise, to put it gently. I’m guessing if they have data loss, corruption, or version issues, they don’t even notice it. Every serious business user I’ve spoken to who’s tried Dropbox has had issues with it and didn’t trust it.

      It’s unfortunate, because I think they have a great concept, great interface and a great look and feel. As my favorite user experience person might say, they need more narwhal, less unicorn.

  6. I am a serious business user, and have had problems with Tomboy Notes losing data. But it was a self-inflicted wound, not thinking about the limitations of a system like Dropbox. It CANNOT know about files being open on multiple machines (many applications do not lock files). I solved the problem by only using the Sync option while Dropbox was connected with the server, and had no more problems. (I spend much of my day out of reach of a network.)

    I now use Dropbox as it is intended, only updating from one machine at a time, as a single user would. I am now happily and reliably using Dropbox. Yes I backup the dropbox folder locally occasionally.

    I have moved on from Tomboy Notes to TiddlyWiki and rename my working TiddlyWiki file periodically. I keep a host of files in Dropbox, but do not expect multi-user style access. I operate as one user on one machine at a time, the rest update instantly, and I can effortlessly move to another machine once I have closed all applications that are accessing files in the Dropbox folder.

    Don’t make the mistake of assuming that Dropbox (or anyone else) can ever solve the problem of multi-user access to files in the cloud. Google docs can, because the applications are on their servers, but I want the data local, and available offline.

    Dropbox has limits, but for one person on multiple machines and operating systems, it Rocks! And no, I don’t work for Dropbox, I am using a (currently) free account. I will have no problems paying for the service when I no longer need to Sync with an Eee PC with 4 GB of disk space. Currently more than 2 GB would be a problem!

    Regards from a happy, professional, serious user.

    • Your point is well taken, Phil. I’ve found Dropbox to be fairly reliable if I just stick with the Mac client. Where it runs into problems is, as you say, when doing work while the client is offline. I’ve also realized that many of my problems stemmed from the Linux client, which I think has greatly lagged in quality. Having said that, I still suspect there are ways to “fool” all of the clients if you try moving folders with a lot of files around and then start working inside those folders before the update is completed. This has nothing to do with working on the same file from multiple places, but simply the difficulties with doing remote sync. I’ve had few problems with Dropbox, but that’s actually the problem. It seems to almost always work, but when it doesn’t you lose data and you don’t know it. The most dangerous software is one that is stable 99.9% of the time but 0.1% of the time destroys data.

  7. Having left a comment yesterday, I promptly discovered I had broken my own rules. I copied a small web site into Dropbox, forgetting it was under Subversion control. SVN puts hidden and read-only files and folders everywhere. Needless to say, this broke Dropbox sync for that folder.

    This has made me think that many of the people complaining about the limitations of Dropbox in the Dropbox forums actually want it to behave like a transparent version control system. There is no such thing, of course. Tools like Subversion can manage multiple users accessing text files, but inevitably someone has to triage inconsistencies by hand occasionally.

    Sometimes sneaker-net with a USB stick is the only safe option.

  8. “With all the new humanities course requirements, I guess you can’t
    trust MIT undergrads with your data until they’ve gotten at least an MS.”

  9. Hi Jonathan,

    I stumbled on your threads-

    MENDELEY: HOW NOT TO RUN A BETA PREVIEW PROGRAM
    DROPBOX: HOW TO REALLY NOT RUN A PUBLIC BETA

    -just *after* installing Mendeley desktop.

    So having not used Mendeley can only listen to your opinion on that. And as a result of said opinion, confess i probably wont start any time soon (too bad mendeley). However! I recall your ignorance of Zotero which i’ve been using for several years now and find it at least as good as any of the commercial citation managers. But my curiosity is surely piqued- what *does* someone like yourself use to look after your pdf library, manage citations, and interface these with your manuscript-generating program of choice?

    And of course this is closely connected to sync: pdf libraries are complexish, if not big, but we want them where ever we are 24/7. Dropbox (with link shell extension), used primarily to sync my zotero library, does, basically, work. So far i’ve also managed (or been forced to) avoid the complexity of multiple OSs: still on winXP all round.

    And then horror- thought i’d try tortoise SVN to stay on top of my dissertation, but a timely read of Phil Stephens experience mixing it with DP has, um dampened that endeavour.

    In the end we want everything, everywhere, fully organised with 100% reliability and 0% time investment. This end is a long way off, but in the mean time, i would certainly appreciate your suggestions on how you deal with these issues.

    thnx for your column, cheers

    • Sorry for the delay in responding, Karl. I’m using Zotero now. While it’s missing a few things, it suffices, and it’s being rapidly improved, something I can’t say for Mendeley. I’ve completely given up on Mendeley, in fact. When I’m writing a paper, I’ll either use the Zotero word plug-ins (if I’m forced to use Word) or I’ll export to a BibTeX file if I’m using LaTeX.

      As for SVN, I’ve also been using that, and have found it to be very reliable. I’m not familiar with Stephens’ writing on it, but if you make sure to ALWAYS use SVN through an http interface (i.e. NEVER use the file:// interface to access the database directly) it’s extremely robust, I think. However, SVN is probably overkill for a dissertation. I’d say just get a good differential backup program and run it often. Oddly enough, the simplest solution I’ve found is the Apple Backup program that comes free with MobileMe. It backs up all my work files to the cloud every hour, functioning not only as a very reliable backup (people often make the mistake of backing up to a computer or disk a few feet away, which isn’t very redundant if you have a fire or earthquake or power surge and everything in your house gets fried) but also as reasonable version control. If you’re talking about mixing SVN with Dropbox (maybe you meant DP) then I can attest that is a terrible, horrible idea (and is tantamount to breaking my rule above about not using direct access to an SVN repository).

      • Jonathan – I’m very sorry Mendeley doesn’t meet your needs and we’d still like to solve whatever issues you ran across. However, you might want to qualify your statement above to note the fact that Mendeley has released several major updates to the Desktop and Web portions of the service since you wrote this post. In fact, there’s a new release to the Web every two weeks. Just today, we added a means of searching the catalog for Open Access material. I’m happy to hear any criticism or comments you may have.

        • I haven’t downloaded a new version for a month or so, but I was keeping up with Mendeley for a while. And there did seem to be a lot of updates, but I could never see much new in them. However, I will edit my comment to note that Mendeley regularly puts out updates. I do like the idea of a facility to search for open access material. Perhaps when I get some time in the near future I will check out Mendeley again and put together some more coherent thoughts on it.

      • Cheers Jon,

        Great to hear what works for you and what doesn’t on this aspect of academia. Seems i’m on the right track, although not sure if LyX is the right editor for this non-LaTeXer to write a dissertation / manuscripts with. Esp. given that i don’t need all of its great math features for a genomics dissertation. But, “InDesign”??! D8

        Great readings at your site, academic or otherwise (im a wanna-be-pilot). Checking back later, 🙂

  10. Maybe the dropbox problems are Linux related. I’ve been using Dropbox for a long time now with a very large file system with 10s of thousands of small files and thousands of large ones across 5-7 computers. I routinely muck around with the internal structure, moving large folders around and into one another, etc. and I’ve yet to experience a single hiccup. However, I do all of this on Windows machines of various kinds– I’ve never tried Dropbox on my Linux machines… and maybe now I won’t!
