Hacker News

> You can only normalize a relational schema.

Normalization is just a method of organization to minimize repetition of data. It has nothing to do with efficiency of operation. This is perfectly valid code:

    person = {
        _id: "person123",
        username: "lloyd-christmas"
    }

    comment = {
        _id: "comment123",
        person: "person123",
        text: "This is how I start",
    }
You don't have to do:

    person = {
        _id: "person123",
        username: "lloyd-christmas"
    }

    comment = {
        _id: "comment456",
        person: {
            _id: "person123",
            username: "lloyd-christmas"
        },
        text: "This is also valid"
    };
Sure, a join is faster than the first version, where you'll have to hit the DB twice. The point is that you don't have to START by denormalizing everything. I start with normalized data and do more DB reads than I strictly need. I figure out how the application uses my data as I go along, and denormalize the pieces I need only once I need them and am confident I won't bump into consistency issues (my username isn't updating every 5 seconds). Through this process I learn what the actual relationships in my application are and how my app functions request to request. This lets me structure my data better. It's a quick update in mongo and usually a couple of lines of refactoring in application logic.
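The process above can be sketched with plain objects standing in for Mongo collections (names are illustrative, not real application code): start normalized with two reads per render, then, once the access pattern is clear and the field is stable, copy the username onto each comment and render with one read.

```javascript
// In-memory stand-ins for a people collection and a comments collection.
const people = {
  person123: { _id: "person123", username: "lloyd-christmas" },
};

const comments = [
  { _id: "comment123", person: "person123", text: "This is how I start" },
];

// Step 1: normalized — rendering a comment costs a second lookup.
function renderNormalized(comment) {
  const author = people[comment.person]; // the second "DB read"
  return `${author.username}: ${comment.text}`;
}

// Step 2: once usage is understood, denormalize by copying the
// username onto each comment (in Mongo, a one-off update).
function denormalizeUsernames() {
  for (const c of comments) {
    c.username = people[c.person].username;
  }
}

// Step 3: after denormalizing, rendering is a single read.
function renderDenormalized(comment) {
  return `${comment.username}: ${comment.text}`;
}
```

The application-side refactor is then just switching which render function the view layer calls.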

Obviously this is just an MCVE. My original point was that I find this to be a drastically more flexible process than starting off relational.



You can normalize data in Mongo and replicate relational database features in your application layer.

I'm just curious how that's an upside over using a relational database from the beginning, when your plan is to migrate to a relational database anyway.

Switch out "Mongo" for "Postgres" in the bulk of your paragraph and you have the same scenario, but with less work on your part and more features to help establish your data model.

One upside I can see is if you're more familiar with Mongo, such that using a relational database slows you down.


Maybe my other comment might help clarify: https://news.ycombinator.com/item?id=11858939

The bottom level of our application layer is a query builder which is almost a drop-in replacement between mongo and postgres. By the time that layer is built out, we know what our database needs to look like. I find adding/dropping fields and models in mongo to be drastically faster than moving models around in postgres. The above example would obviously end up in the same structure, whether we started with relational or not. It was nothing more than demonstrating an iterative process where you don't need to START denormalized just because "that's why you use nosql".
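A hypothetical sketch of what such a backend-agnostic query-builder layer could look like (these names and the naive SQL quoting are illustrative only, not the commenter's actual code): the application builds one query description, and a per-backend method compiles it to either a Mongo filter object or a SQL string.

```javascript
// Minimal chainable query builder with two "compilers".
function query(collection) {
  const q = { collection, conditions: [] };
  return {
    where(field, value) {
      q.conditions.push([field, value]);
      return this; // allow chaining
    },
    // Compile to a Mongo-style filter document.
    toMongo() {
      return Object.fromEntries(q.conditions);
    },
    // Compile to a SQL string (naive quoting, demo only — a real
    // builder would use parameterized queries).
    toSql() {
      const where = q.conditions
        .map(([f, v]) => `${f} = '${v}'`)
        .join(" AND ");
      return `SELECT * FROM ${q.collection} WHERE ${where}`;
    },
  };
}
```

With this shape, swapping databases means swapping which compile method the bottom layer calls, while everything above it keeps building the same query objects.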

We try to be as incremental as possible when building our apps, and have found that using nosql allows for 20 small refactors that would often end up being 2 larger refactors with a relational db. We've just found that it ends up being a faster production process, and we end up with a much more application-specific database instead of just "This is a Person, this is an Address, this is a Comment". Sure, we know beforehand that the application will contain all those components. We don't necessarily know how they'll be used on a request-by-request basis, or whether they will actually end up being one-to-one, one-to-many, or many-to-many.


"It has nothing to do with efficiency of operation."

To nit-pick, I think normalisation improves the efficiency of updates, as you only have to modify the one place where the piece of data lies.
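The nit-pick can be made concrete with the same toy objects (illustrative only): normalized, a username change is one write; once the username is embedded in comments, every copy must be found and rewritten, and any missed copy is a consistency bug.

```javascript
const person = { _id: "person123", username: "lloyd-christmas" };

// Comments with the username already denormalized onto them.
const comments = [
  { _id: "c1", person: "person123", username: "lloyd-christmas", text: "a" },
  { _id: "c2", person: "person123", username: "lloyd-christmas", text: "b" },
];

// Normalized: one write — readers resolve the name through person._id.
person.username = "harry-dunne";

// Denormalized: one write per embedded copy.
let extraWrites = 0;
for (const c of comments) {
  if (c.person === person._id) {
    c.username = person.username;
    extraWrites++;
  }
}
```

Here the single normalized write becomes 1 + N writes once N copies exist, which is exactly the update-efficiency cost being pointed out.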



