|
|||
|
Hi everyone,
This is my first post ... I need some help coming up with the right architecture for my application. I've only had experience working with existing database systems, this is the first time I have to design something from the ground up. So please bear with me ![]() Imagine building a web application where customers will be able to list their own items (a bit like Ebay) add them to different categories etc .... and other people can change some of the attributes to that item. Also, after submitting a new item, it should be searchable by anyone. Here's my dilema : I plan on using Oracle to store all those items, but I'm concerned about the performance hit when people will be executing search queries on the db. I'll obviously have indexes created, but I'm not sure how well that will scale as traffic increases. I thought about using Berkeley Dbs for searching and fetching the items under a category, and hit Oracle only when looking up the details of an item, but then I would have stale data in the BDBs until they get updated. Any advice ? I don't have precise numbers of how many hits/sec I'll have to support, at this point I would just like some advice on what I should consider. Thank you all, -k |
|
|
||||
|
||||
|
|
|
|||
|
Keiko wrote: > Hi everyone, > > This is my first post ... I need some help coming up with the right > architecture for my application. I've only had experience working with > existing database systems, this is the first time I have to design > something from the ground up. So please bear with me ![]() > > Imagine building a web application where customers will be able to list > their own items (a bit like Ebay) add them to different categories etc > ... and other people can change some of the attributes to that item. > Also, after submitting a new item, it should be searchable by anyone. > Here's my dilema : I plan on using Oracle to store all those items, > but I'm concerned about the performance hit when people will be > executing search queries on the db. I'll obviously have indexes > created, but I'm not sure how well that will scale as traffic > increases. I thought about using Berkeley Dbs for searching and You think Berkeley DB will scale better than ORACLE?? REALLY? My company does web hosting. One of our clients has a sales site. They have thousands of products. They add new products and promotions regularly (At least weekly, if not daily). They have thousand of ciustomer hitting that site daily. The back end DB? ORACLE. The scalability and reliability just outweigh the added costs. And it would avoid the complexity you are thinking of adding by using two different DBMS products. And a general comment: get the functionality working first. While performance shouldn't be an afterthought, it also should not be your first thought. > fetching the items under a category, and hit Oracle only when looking > up the details of an item, but then I would have stale data in the BDBs > until they get updated. Any advice ? I don't have precise numbers of > how many hits/sec I'll have to support, at this point I would just like > some advice on what I should consider. > > Thank you all, > > -k Hardware speeds (CPU, disc, RAM) and network connection bandwidth are going to be bigger influences in your set up than the DBMS. HTH. Ed PS give me a call if you think you could use more help. Magic Interface, Ltd. 330-498-3700 |
|
|||
|
Thanks for your comments ! When you have lots of reads without the need
for SQL (such as when reading catalog info on some product page), It's my understanding that Berkeley DB would perform better because it doesn't suffer from the SQL overhead (is this an accurate statement ?), and it would minimize the hits to Oracle by eliminating an important number of queries. I expect a lot more searches/browsing than people loading up pages with item details. Shouldn't I try to distribute or minimize the load on Oracle ? >And a general comment: get the functionality working first. While >performance shouldn't be an afterthought, it also should not be your >first thought. That makes sense, thanks a lot. -k |
|
|||
|
"Keiko" <keikoh@mail.com> wrote in message
news:1140911852.194088.119930@z34g2000cwc.googlegr oups.com... > I expect a lot more searches/browsing than people > loading up pages with item details. Shouldn't I try to distribute or > minimize the load on Oracle ? You're assuming that the RDBMS is going to be the bottleneck. In my experience, with proper indexes, cache configuration, hardware sizing, etc., the RDBMS ceases to be the bottleneck (except in high load cases like real-time streaming data collection). As for your situation of lots of browsing vs. fewer detail pages, you might find that some sort of transparent query results cache helps scalability a lot. Some RDBMS implementations include a feature of caching the results of a query. I use MySQL, which can do this. I don't know if Oracle has an equivalent feature. Is the following relevant? http://www.orafaq.com/faqicach.htm Regards, Bill K. |
|
|||
|
"Ed Prochak" <edprochak@gmail.com> wrote:
> And a general comment: get the functionality working first. While > performance shouldn't be an afterthought, it also should not be your > first thought. For the case in point, that's doubtless fair. But if a given system has to scale to some fabulous degree, it is likely that it'll need a redesign at some point in order to address the fact that it'll be specific aspects of it that have to cope with growth. You can't know what the *actual* bottlenecks will be until it has "contact with the enemy" (e.g. - users :-)). You can't design up front to evade them all; it'll take some redesign to address the ones actually seen. When we deploy new systems, which get lots of traffic, there's always some surprise out of left field... -- output = ("cbbrowne" "@" "acm.org") http://linuxfinances.info/info/finances.html Signs of a Klingon Programmer - 8. "Debugging? Klingons do not debug. Our software does not coddle the weak. Bugs are good for building character in the user." |
|
|||
|
Christopher Browne wrote: [...] > Signs of a Klingon Programmer - 8. "Debugging? Klingons do not > debug. Our software does not coddle the weak. Bugs are good for > building character in the user." Stumbled over this, might be the language of choice :-) http://web.tampabay.rr.com/ngabwi/ /Lennart |
|
|||
|
> You're assuming that the RDBMS is going to be the bottleneck. In my
> experience, with proper indexes, cache configuration, hardware sizing, etc., > the RDBMS ceases to be the bottleneck (except in high load cases like > real-time streaming data collection). You're right, that is what I'm assuming, and it is because I'm having a hard time understanding why it wouldn't be. I could imagine some queries returning more than 20% of total rows (imagine querying for all items under some category), in which case oracle would do a full table scan. As my number of rows increases, I don't see how that scales even in the "short-term". I'll also want to support sorting on various fields (say price, title, etc ...) Won't Oracle just come down to its knees at some point ? Even if it can handle it, some queries would just take a long time, no ? can you tell I'm confused ?-k |
|
|||
|
"Keiko" <keikoh@mail.com> wrote in message
news:1141062335.470436.248990@e56g2000cwe.googlegr oups.com... >> You're assuming that the RDBMS is going to be the bottleneck. > > You're right, that is what I'm assuming, and it is because I'm having a > hard time understanding why it wouldn't be. The bottleneck is the component of a system that is hindering its performance the most. Even if the RDBMS is working very very hard, and perhaps even exceeding its ability to keep pace, there could be another part of the system that is falling behind worse. So if you were able to improve the ability of the RDBMS to respond, it wouldn't result in any visible change to the performance of the system. In any case, what I've learned about performance issues is that assuming one component or another will be the bottleneck rarely gives accurate understanding of the dynamics of the system. To be certain, you have to run some tests, with a volume of data and simulations of demand on the system. Have you looked into query result caching features in Oracle? I don't have experience using Oracle; I'm just guessing that it must have such a capability, since other "toy" databases like MySQL do. Oracle may even do query caching as a matter of course, for all I know. > I could imagine some > queries returning more than 20% of total rows (imagine querying for all > items under some category), in which case oracle would do a full table > scan. How could a user possibly look at a result set of 20% of your total rows? I see many people designing systems like this and then worrying about the scaleability, when what they should be worrying about is the usability. Look at some high-volume web sites like eBay, etc. They show you progressive screens asking you to narrow your focus. Sometimes they show you a sampling of rows of the broader search, but sometimes you're forced to get more specific before they show any results. Or else the results you are shown are also filtered as "featured" items, etc. Some RDBMS also support the concept of the "first N" records of a result set, which can return a subset of the results very quickly. Oracle does this with a pseudocolumn called ROWNUM, but using it is tricky. MySQL and PostgreSQL have a LIMIT clause that is much easier to use. > I'll also want to support sorting on various fields (say price, title, etc > ...) Sorting on an index that is 100% loaded into an index cache (in memory) should be very fast. If you have very frequent queries that need these indexes, do what it takes to get them cached. Regards, Bill K. |
|
|||
|
Keiko wrote: > Thanks for your comments ! When you have lots of reads without the need > for SQL (such as when reading catalog info on some product page), It's > my understanding that Berkeley DB would perform better because it > doesn't suffer from the SQL overhead (is this an accurate statement ?), Multiple reads would be cached by the web server. The DBMS would have little to do with that. (But I'm not involved in the day to day running of the sites. But I really doubt Berkeley DB has this feature and ORACLE doesn't.) Depending on the query, I know ORACLE is faster on the subsequent runs of the same query. > and it would minimize the hits to Oracle by eliminating an important > number of queries. I expect a lot more searches/browsing than people > loading up pages with item details. Shouldn't I try to distribute or > minimize the load on Oracle ? Until you know ORACLE cannot keep up (which would really surprise me), I suggest you not set up so many hoops for your programmers to jump thru. Ed |
|
|||
|
Keiko wrote: > > You're assuming that the RDBMS is going to be the bottleneck. In my > > experience, with proper indexes, cache configuration, hardware sizing, etc., > > the RDBMS ceases to be the bottleneck (except in high load cases like > > real-time streaming data collection). > > You're right, that is what I'm assuming, and it is because I'm having a > hard time understanding why it wouldn't be. I could imagine some > queries returning more than 20% of total rows (imagine querying for all > items under some category), in which case oracle would do a full table > scan. The number of rows of percentage of rows returned has NOTHING to do with whether ORACLE will do a full scan or not. The Optimizer calculates the costs associated with reading the indices and then reading the full table. A query for 20% of the rows based on an indexed column like product type likely will run thru the index. You are making bad assumptions. You need to work with ORACLE for a while to understand what it actually does. > As my number of rows increases, I don't see how that scales even > in the "short-term". I'll also want to support sorting on various > fields (say price, title, etc ...) Won't Oracle just come down to its > knees at some point ? Even if it can handle it, some queries would just > take a long time, no ? > > can you tell I'm confused ?> > -k Suggestion: Go read the ORACLE CONCEPTS Manual. (visit their web site.) Yes a poorly configured ORACLE DB can bog down easily, but such cases are more often due to poor datamodels than problems with ORACLE. Do you have a datamodel yet? Is it properly normalized? If you have further questions about ORACLE, I suggest posting in comp.databases.oracle.misc (since your questions are still rather general) and web based questions might go into comp.databases.oracle.tools I gotta go, but I don't say you are confused, just uninformed. Ed Ed. |
|
|||
|
"Ed Prochak" <edprochak@gmail.com> wrote in > Suggestion: Go read the ORACLE CONCEPTS Manual. (visit their web site.) > Yes a poorly configured ORACLE DB can bog down easily, but such cases > are more often due to poor datamodels than problems with ORACLE. > Agreed. But let me add to that. An ORACLE DB can get bogged down because of a clumsy logical model that gets reflected in bad table and column design. It can also get bogged down because of a clumsy physical model, which gets reflected in bad index design, bad tablespace allocations, and other bad settings. It can also get bogged down by badly written query or update transactions, even if those transactions are logically correct. It can also get bogged down by a volume or load that goes way beyond what was contemplated at design time. > Do you have a datamodel yet? Is it properly normalized? If you have > further questions about ORACLE, I suggest posting in > comp.databases.oracle.misc (since your questions are still rather > general) and web based questions might go into > comp.databases.oracle.tools Good questions and comments. The OP may be in for a long learning curve. |
|
|||
|
David Cressey wrote: > "Ed Prochak" <edprochak@gmail.com> wrote in > > > > Suggestion: Go read the ORACLE CONCEPTS Manual. (visit their web site.) > > Yes a poorly configured ORACLE DB can bog down easily, but such cases > > are more often due to poor datamodels than problems with ORACLE. > > > > Agreed. But let me add to that. > > An ORACLE DB can get bogged down because of a clumsy logical model that gets > reflected in bad table and column design. It can also get bogged down > because of a clumsy physical model, which gets reflected in bad index > design, bad tablespace allocations, and other bad settings. Which is just a more detailed description of what I said. So AGREED. Let me add this applies to ANY DBMS. Fools can make any system look bad. (The fact the the OP is asking for help shows he is not a fool.) > ... It can also get > bogged down by badly written query or update transactions, even if those > transactions are logically correct. Good point, which I didn't cover. > > It can also get bogged down by a volume or load that goes way beyond what > was contemplated at design time. Again, if his site gets that much traffic, then it is hardware that is needed at that point. Putting some other DBMS in front is NOT going to help. > > > > > Do you have a datamodel yet? Is it properly normalized? If you have > > further questions about ORACLE, I suggest posting in > > comp.databases.oracle.misc (since your questions are still rather > > general) and web based questions might go into > > comp.databases.oracle.tools > > Good questions and comments. The OP may be in for a long learning curve. But the main point in his favor is that he does want to learn! Thanks for the clarifying remarks. Ed |
|
|
![]() |
| Thread Tools | |
| Display Modes | |
|
|
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Re: experimental design question | Sigurd Hermansen | Newsgroup comp.soft-sys.sas | 0 | 12-12-2008 09:28 PM |
| Re: Creating Fractional Factorial Design | Ian Wakeling | Newsgroup comp.soft-sys.sas | 0 | 04-30-2007 10:39 AM |
| OT: design theory: wicked problems | Fehd, Ronald J. | Newsgroup comp.soft-sys.sas | 0 | 01-23-2007 03:04 PM |
| Re: How do I generate a Resolution V Fractional Factorial Design | David L Cassell | Newsgroup comp.soft-sys.sas | 1 | 08-08-2005 02:53 PM |