Between MySQL and PostgreSQL, which is better suited to a very large amount of data, for example millions of records? I think I should use PostgreSQL. Any suggestions, guys?


I think it depends a lot on what you mean by "better". You should probably identify your needs before choosing one or the other.


Faster? More reliable? Supports replication? Can do more complex queries? Is your application amenable to "sharding"? If so, you probably want a database which can be clustered and administered more easily. Or do you need everything in one massive set of linked tables? In that case you probably want good support for many cores and large memory. Do you have a complex authentication setup, or is it a simple "one user" web application? Is the bulk of the data in binary objects, or is it simple numbers and strings? How will you do your backups?


MySQL and PostgreSQL both seem to be very capable databases, and both have been used successfully at large scale, so I'd suggest you need to identify the specific needs of your application first.


My inclination would be towards PostgreSQL, but that's mainly because I had a few disasters with MySQL losing data a few years ago and I haven't come to trust it again. PostgreSQL has been very nice in terms of being able to make backups easily.




I've used both in similar situations, and sheer size of the DB doesn't seem to affect their scaling in substantially different ways. PostgreSQL is much more complete and solid, and will much better support complex queries and their optimization, while MySQL may shine in terms of retrieval speed for extremely simple queries; but these aspects are independent of the sheer size issue.




Postgres has a richer set of abilities and a better optimizer; its ability to do hash joins often makes it much faster than MySQL for joins. MySQL is rumored to be faster for simple table scans. The storage engine you use underneath matters a lot, as well.


At some point, scaling becomes a choice between two options: scale by buying bigger hardware, or scale by introducing new machines (which you can shard the data across, use as slave replicas, or run in a master-master setup; both Postgres and MySQL have solutions of varying quality for these sorts of things).
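To make the "shard the data across new machines" option concrete, here is a minimal sketch of hash-based shard routing. It is not how any particular MySQL or Postgres tool does it; the shard names and the `shard_for` and `group_by_shard` helpers are made up for illustration, and a real setup would map each name to a database connection.

```python
import hashlib

# Hypothetical shard names; in practice each would be a connection string.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(user_id, shards=SHARDS):
    """Pick a shard from a stable hash of the key.

    A cryptographic hash is used only because it is stable across
    processes and machines, unlike Python's built-in hash() for strings.
    """
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def group_by_shard(user_ids, shards=SHARDS):
    """Group keys by the shard that owns them, e.g. to batch queries."""
    groups = {}
    for uid in user_ids:
        groups.setdefault(shard_for(uid, shards), []).append(uid)
    return groups


if __name__ == "__main__":
    for shard, ids in sorted(group_by_shard(range(20)).items()):
        print(shard, ids)
```

Note that plain modulo routing like this moves most keys when the number of shards changes, which is one reason adding machines is harder than it sounds.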


A few million rows of table data fit in a standard server's memory these days; if that's all you are doing, you don't need to worry about this stuff. Just tune whichever database you are most comfortable with: make sure the proper indexes are created, everything is cached (with something like memcached where appropriate), and so on.
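As a rough illustration of the "proper indexes" point, the sketch below compares the query plan for the same lookup before and after an index exists. It uses Python's built-in `sqlite3` purely as a stand-in so it runs anywhere; the `records` table, its columns, and the index name are invented for the example. MySQL and PostgreSQL expose the same idea through their own `EXPLAIN` commands, with different output.

```python
import sqlite3


def build_demo_db(n_rows=10000):
    """Create an in-memory table with made-up data (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO records (customer_id, amount) VALUES (?, ?)",
        ((i % 1000, i * 0.5) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


conn = build_demo_db()
sql = "SELECT SUM(amount) FROM records WHERE customer_id = ?"

# Without an index, the filter forces a scan of the whole table.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filtered column, the planner can search it instead.
conn.execute("CREATE INDEX idx_records_customer ON records (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan should describe a table scan and the second should mention `idx_records_customer`.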


People mention that Facebook uses MySQL; that's kind of true. Kind of, because what they are actually doing is using hundreds (thousands now?) of MySQL databases, each responsible for its own little cross-section of the data. If you think you can load Facebook into a single MySQL (or Postgres, or Oracle) instance... well, they'd probably love to hear from you ;-).


Once you get into the terabyte land, things get difficult. There are specialized solutions like Vertica, Greenplum, Aster Data. There are the various "nosql" datastores like Cassandra, Voldemort, and HBase. But I doubt you need to go to such an extreme. Just buy a bit more RAM.
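For a sense of how far "millions of records" is from terabyte land, here is a back-of-envelope estimate. The 200-byte average row and the 2x overhead factor for indexes and storage bookkeeping are assumptions picked for illustration, not measurements from either database; plug in your own numbers.

```python
def estimated_size_gib(rows, avg_row_bytes, overhead_factor=2.0):
    """Rough on-disk/in-memory size in GiB.

    overhead_factor is an assumed multiplier for indexes and per-row
    bookkeeping; real values depend on the schema and storage engine.
    """
    return rows * avg_row_bytes * overhead_factor / 2**30


# 5 million rows at an assumed 200 bytes each.
print(round(estimated_size_gib(5_000_000, 200), 2))  # 1.86
```

Under those assumptions the whole data set is under 2 GiB, which fits comfortably in RAM on ordinary hardware.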




Well, it ultimately depends on what you are most comfortable with. According to MySQL, there is no imposed theoretical limit on the size of the database; it depends on the capability of the hardware supporting it. As for the number of rows, with InnoDB the theoretical limit is 256 terabytes. I keep saying "theoretical" because there is very little chance you could actually index 256 terabytes of data, so that figure is only their approximation of where a limit might be. If you hit that maximum, you've got bigger problems. Current production users of MySQL that I can think of are YouTube and Facebook. Those are probably the two largest, and they appear to be faring well.


But once again, as I stated above, it is whatever you are most comfortable with.


