Login or Sign Up to become a member!
LessThanDot Sit Logo

LessThanDot

Data Management

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot friendfeed Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Your profile

    Search

    XML Feeds

    Google Ads

    « Do you use Column=@Param OR @Param IS NULL in your WHERE clause? Don't, it doesn't performDelay Validation with SSIS Tasks »
    comments

    Lee Everest created a post named MongoDB vs. SQL Server - INSERT comparison where he compared inserting 50001 rows with MongoDB vs. SQL Server. So he claims that MongoDB inserts 50001 rows in 30 seconds while the SQL Server one takes 1.5 minutes. Okay so I looked at this SQL Script and can make 2 improvements

    First create this table

    1. CREATE TABLE MongoCompare
    2.     (guid uniqueidentifier
    3.     ,value int
    4.     )
    5. GO

    Here is the script he used.

    1. DECLARE @id int = 1
    2. WHILE (@id < 500001)
    3. BEGIN
    4.     INSERT INTO MongoCompare VALUES (newid(), @id)
    5.     SET @id+=1
    6. END
    7. GO

    Now if I was to write that code I would write it with an explicit transaction and I would also use nocount on. So If we do this

    1. SET NOCOUNT ON
    2. BEGIN TRAN
    3. DECLARE @id int = 1
    4. WHILE (@id < 500001)
    5. BEGIN
    6.     INSERT INTO MongoCompare VALUES (newid(), @id)
    7.     SET @id+=1
    8. END
    9. COMMIT

    SQL Server now only takes 6 seconds and you have one atomic block of code, either all succeeds or nothing succeeds.




    *** Remember, if you have a SQL related question, try our Microsoft SQL Server Programming forum or our Microsoft SQL Server Admin forum

    About the Author

    User bio imageDenis has been working with SQL Server since version 6.5. Although he worked as an ASP/JSP/ColdFusion developer before the dot com bust, he has been working exclusively as a database developer/architect since 2002. In addition to English, Denis is also fluent in Croatian and Dutch, but he can curse in many other languages and dialects (just ask the SQL optimizer) He lives in Princeton, NJ with his wife and three kids.
    Social SitingsTwitterFacebookLinkedInHomePageFlickrLTD RSS Feed
    8131 views
    Instapaper

    15 comments

    Joel Coehoorn I hate to say it because I'm a dedicated relational guy, but there's still a small point in favor of the no-sql crowd:

    How many developers would think to use a transaction or nocount? Or, failing that, the scenario this most likely represents is 50,000 users each adding one record individually from the web. Nocount still makes sense, but I think the bigger optimization is the transaction, and that's not part of the picture any more.
    04/05/10 @ 20:39
    Comment from: SQLDenis [Member] Email
    SQLDenis Joel,

    Yes I agree that converting implicit transaction to explicit helps a lot. The way I load my data is bcp or BULK INSERT if the data is of significant size. BTW if anyone is doing 50000 insert I sure hope they are not using implicit transactions :-)

    But even if you have 50000 users adding it from the web it won't be less than 5 minutes. I don't believe that there are many sites that do 50000 inserts per minute.

    Also take a look at Paul Nielsen's post here http://sqlblog.com/blogs/paul_nielsen/archive/2007/12/12/10-lessons-from-35k-tps.aspx where he does 35K tps.



    04/05/10 @ 20:50
    Comment from: Alex Ullrich [Member] Email
    Alex Ullrich Joel makes a good point about the inserts coming from multiple sources. Is there any improvement from just adding the NoCount?

    Saw a similar post here: http://www.yafla.com/dforbes/The_Impact_of_SSDs_on_Database_Performance_and_the_Performance_Paradox_of_Data_Explodification/
    04/05/10 @ 21:11
    Comment from: SQLDenis [Member] Email
    SQLDenis >> Is there any improvement from just adding the NoCount?

    Not much, around 2 seconds
    04/05/10 @ 21:21
    Comment from: Alex Ullrich [Member] Email
    Alex Ullrich Heh, surprised its that much. Inserts are only one part of the picture of course :)
    04/05/10 @ 22:24
    Comment from: dm_mongodb [Visitor] Email · http://www.mongodb.org/
    dm_mongodb For a single table like this, I actually would not be surprised if an RDBMS is the same speed (give or take). A ton of work has been done on performance for these systems in the past.

    Where I think the performance benefit could be more significant is when there is some "unfair advantage" (that we want), such as some reasonable denormalization. For example, an ORDER/ORDER-LINEITEM schema, each order would typically be a single document in MongoDB. That should be faster than putting 1 row in orders and 7 in ORDER-LINEITEM. And the nice thing with mongo is I can still easily do a query like "find me all orders where sku 123 was ordered", as it 'supports reaching into objects'.

    Plus even at speeds in the same ballpark i love to use mongo for (1) ease of development via being a JSON db and (2) horizontal scaleout capability. (For appropriate use cases naturally.)
    04/06/10 @ 01:24
    Comment from: Lee [Visitor] Email · http://www.texastoo.com
    Lee The test was to view 500k transactions rather than one transaction with 500k rows in a single connection for both databases. There are flaws in this minimal test no doubt, but you make some great points.

    Lee
    04/06/10 @ 05:01
    Comment from: SQLDenis [Member] Email
    SQLDenis Got this in an email from Jeremy Lowell

    ------------------------------------------


    Interesting post.

    The point that I took away from it is that knowing the database engine that you are working in, and in this case, differing manners to handle transactions, is significantly more important than the engine that it's run against. I'd be quite surprised if someone was unable to better that 30 second load in MongoDB.

    For fun I decided to add a primary key to the mix and by default a PK on the first column in the table (guid). Script and timing of the script can be found below. This was run on my laptop with a singular 7500 RPM drive.

    CREATE TABLE MongoCompare
    (guid UNIQUEIDENTIFIER PRIMARY KEY
    ,VALUE INT
    )
    GO
    SET NOCOUNT ON
    DECLARE @id INT = 1
    WHILE (@id < 500001)
    BEGIN
    INSERT INTO MongoCompare VALUES (newid(), @id)
    SET @id+=1
    END
    GO
    -- Time elapsed = 2:44

    DROP TABLE MongoCompare
    GO
    CREATE TABLE MongoCompare
    (guid UNIQUEIDENTIFIER PRIMARY KEY
    ,VALUE INT
    )
    GO
    SET NOCOUNT ON
    BEGIN TRAN
    DECLARE @id INT = 1
    WHILE (@id < 500001)
    BEGIN
    INSERT INTO MongoCompare VALUES (newid(), @id)
    SET @id+=1
    END
    COMMIT
    -- Time elapsed = :12

    DROP TABLE MongoCompare
    GO
    CREATE TABLE MongoCompare
    (guid UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID() PRIMARY KEY
    ,VALUE INT
    )
    GO
    SET NOCOUNT ON
    BEGIN TRAN
    DECLARE @id INT = 1
    WHILE (@id < 500001)
    BEGIN
    INSERT INTO MongoCompare (VALUE) VALUES (@id)
    SET @id+=1
    END
    COMMIT
    -- Time elapsed = :07
    04/06/10 @ 05:18
    Comment from: dm_mongodb [Visitor] Email · http://www.mongodb.org/
    dm_mongodb the test below I ran in the mongo Javascript shell as a client. It inserted 50k objects in MongoDB on my desktop in 2 seconds. for simple small objects without a bunch of indexes i would expect 20k+ inserts/second with mongodb.

    http://gist.github.com/357622
    04/06/10 @ 07:44
    Comment from: Kristina [Visitor] · http://www.snailinaturtleneck.com
    Kristina Mongo has a batch insert that would be roughly equivalent to doing all the inserts in on transaction. It would be a more equitable contest to compare those speeds.
    04/06/10 @ 10:51
    Comment from: SQLDenis [Member] Email
    SQLDenis Kristina,

    SQL Server also has a minimally logged insert, you can use bcp or BULK INSERT
    04/06/10 @ 11:08
    Comment from: Funny [Visitor]
    Funny I did the same tests as dm_mongodb, and MongoDB owned SQL Server. It is at least 3x faster...
    04/11/10 @ 18:31
    Comment from: Dan [Visitor]
    Dan How well does Mongo handle concurrency? If the same test is run while another process/thread is reading the table, what happens?
    04/27/10 @ 14:04
    Dustin The comparison of speed is definitely pertinent, so I don't want to discount that. However, there's so much more to consider here. The document structure of MongoDB is a very natural fit for many web applications. You can pretty much skip an ORM because there's no mapping to do. However, you lose big on the inability to normalize data very well and you also lose big on ACID compliance (e.g. transactions, etc.) which is an absolute must have for certain things. So, while the speed of inserts is a very interesting conversation, please don't make your decision on MongoDB vs. SQL Server based on the insert benchmarks. The two databases really aren't competing against each other. They solve entirely different problems.

    That being said, this was an interesting read.
    06/15/10 @ 21:19
    Comment from: Andrew [Visitor]
    Andrew I did some sql server testing (using bltoolkit/linq and a table with around 20 columns) against mongodb using (norm) and the same class. In my case sql server took 4 seconds to do 10k inserts, or about 1.6 seconds if in a transaction.

    mongo took 10 seconds to 10k inserts, or 0.7 seconds if I did it in batch.

    And after various tests of other things, including lazy loading, foreign keys, I don't know. I'm not convinced mongo is going to be faster than sql server for anything I'm going to do. Particularly with norm, I'm tying myself to .net objects which means I'm not really schemaless, I just don't have to create the schema. I also messed with Expando which I like, and I have a project that it makes sense for, but in the general case I don't see it helping much. And sql server has stored proc's, change notifications, more authentication options, etc, genuine foreign keys with options for what to do when one is deleted etc. So, I think I have one project where mongo is clearly the perfect match, and the rest where sql server is probably the better choice.

    I've been trying to think if I was to redo some projects from scratch would I be able to make use of mongo and I'm still not sure. Give me triggers, notifications, and some logic for foreign keys/update/delete and then I could see mongo being the more reasonable choice for various projects depending on the data to be stored.
    04/12/11 @ 14:04

    Leave a comment


    Your email address will not be revealed on this site.

    Your URL will be displayed.
    (Line breaks become <br />)
    (Name, email & website)
    (Allow users to contact you through a message form (your email will not be revealed.)