CNOUG's Archiver

ern posted on 2005-5-3 16:20

Thomas Kyte, Jonathan Lewis VS Don Burleson, Mike Ault

[url]http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:38264759390157[/url]
[url]http://asktom.oracle.com/pls/ask/z?p_url=http%3A%2F%2Fwww.jlcomp.demon.co.uk%2Fthinking.html&p_cat=DEBATE_2005&p_company=10[/url]

VS

[url]http://www.dba-oracle.com/oracle_tips_ault_proofs.htm[/url]
[url]http://dba-oracle.com/oracle_news/2005_3_28_false_proofs.htm[/url]
[url]http://dba.ipbhost.com/index.php?s=aee573c8c03756096f4818fb5d9fb1b4&showtopic=1396&st=15[/url]

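Heh, both sides are heavyweights, but I can't open Jonathan Lewis's site at the moment. Can anyone supply the material? Personally, I prefer the first two :). I have already posted one of Tom's articles, and it is very good. [url]http://218.94.123.17/viewthread.php?tid=56523[/url]
Have a look, everyone. Whichever side you take, these are all good articles. There aren't many articles on methodology around to begin with, and now N of them have come out at once, so grab them while you can.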

ern posted on 2005-5-3 16:23

Understand the Oracle “proof” Myth--Mike Ault

Understand the Oracle “proof” Myth

Mike Ault

Be Careful In What You Prove

I have recently been researching scuba diving sites within the USA for a possible series of articles for a scuba magazine. Believe it or not, I do have interests outside of Oracle! One of the sites, Rock Lake, Wisconsin is of particular interest for what lies beneath its turgid waters. But if I reveal too much now it might dilute the future article. What it has brought to light is the fact that you must be careful not only in what you believe to be true but in what you prove to be true. The tales of what lay beneath Rock Lake vary from lost Atlantis to only legends. I am sure if the two camps were brought together, bloodshed would ensue as each defended their point of view. To each of the camps, what they believe is the truth.

Many people believe that truth is immutable, that what is true now is always true. However, times change, technology changes, even (especially) theology. It was not so long ago that respected scientists believed that a Doctor washing his hands was not professional, that traveling at speeds in excess of 30 miles per hour would do irreparable damage to the human frame, that traveling faster than the speed of sound would result in a crash much like hitting a brick wall, or that setting off an atomic bomb would ignite the atmosphere and annihilate mankind.

Of course all of the above have been proven incorrect. Now there are those who propose that the cherished notion that the speed of light is absolute is also a myth that will be dispelled with time. Believe it or not, the "warp" drive of Star Trek fame has a basis in theoretical physics. Of course it requires more energy than we can currently produce, but in medieval times wouldn't all of our current technology have given us a one-way ticket to the witches' bonfire? The energy we can produce today is perhaps a far cry from the energy we will eventually be able to produce. Some go as far as to claim that some super-novas are proof that some technologies lost control of zero-point energy.



So too must we view "proofs" of things that are accepted in current technology, in Oracle and in other technologies, as only transient in nature. Watch out that the "proofs" of today don't become tomorrow's old wives' tales.

For example, it wasn't too long ago that separation of indexes and tables in databases was a good and accepted method for improving performance. Of course this was because, if they weren't separated, they would sit on the same disk platter and contend for it. Now, with the almost universal implementation of RAID technology (redundant arrays of independent/inexpensive disks), this becomes unneeded (sometimes, depending on the amount of data stored on each disk in the array). It could be demonstrated, when they shared a single disk, that moving indexes away from tables improved performance. Based on that proof, no longer valid, some will say that moving indexes away from tables always improves performance.

Likewise, there is the old saw that moving to RAW devices in UNIX (raw meaning the application was responsible for IO) would result in large improvements in performance. This of course was based on the premise that all of the buffering in the OS for non-raw filesystems caused delays. Modern file systems can now be set to eliminate these buffering and journaling delays, and the performance gains from raw have diminished to nearly nothing. Of course, rebuilding the objects within the databases involved (tables and indexes), and so restoring them to proper parameters that they may have exceeded through years of neglect, also helped.

So now we have a new crop of experts providing proofs (sometimes limited to a single-user, small database on a laptop) that their methods are the best and no doubt their proofs will be cited long after they are useful or meaningful and their expert advice will fade into old DBA tales as new technologies and methods become the rage. This is as it should be.

I guess I am trying to say, in a rambling way, that today's sage advice becomes tomorrow's old tale. We must all be aware of what the current methods are, realize when the old methods no longer apply, and gracefully accept new ways if we are to grow and prosper. However, we must also recognize when the "proper" method evolves and mutates into a "new" method, leaving the old ways to die away.

So be careful in what you prove and how you prove it. What seems clever today may come back to haunt you. You may be apologizing several years down the road for what is right now, but not in the future. The net is an amazing place. No doubt you can find numerous places where I may have expounded on the virtues of separation of indexes and tables, on rebuilding indexes frequently or other items that have since been proved, for current versions, to be old DBA tales. I, like others before me, am not immune to time and neither is my advice.

If any paper you read is older than a year or two, I suggest you take its advice with a big grain of salt as it may be applicable only to history and not current events.

Critics love to dig up old papers, presentations and advice given and use it to bludgeon people into believing they are the only authority. Take this type of advice with a grain of salt as well. In time, their advice will be referenced, out of date and out of style.

ern posted on 2005-5-3 16:25

Do single-user scripts "prove" Oracle Performance?--Don Burleson

[url]http://dba.ipbhost.com/index.php?showtopic=1396[/url]

ern posted on 2005-5-3 16:26

In search of the truth--Thomas Kyte

In search of the truth

Or Correlation is not Causation



Before I begin, I would like to point out that no technical information below is made up.  It is all real, verifiable data. Any numbers presented should be reproducible on your system (and if not, I'd like to know about it and understand why).  When I relate technology in a learning fashion, I strive to have examples that are representative and show how the software actually works.  I will not ever make up numbers to support a conjecture. Not here in this paper, not in any of my writings on the internet, and certainly not in any books I author.



This article was prompted by my reading of a couple of articles, namely “Understand the Oracle ‘proof’ Myth” by Mike Ault and “Important new Oracle performance myth is debunked” by Don Burleson (note: the thread referenced in that article used to be 12 pages long, it is now 11. Hopefully anything I’ve referenced is still there when you read this. You can see it has been selectively edited by noting comments like “QUOTE (ora_dba_guy @ Apr 2 2005, 05:19 PM)” in a follow-up, although there is apparently no longer any posting from ora_dba_guy at that time. Or Don’s second to last posting, where he addresses a posting that no longer exists). Jonathan Lewis has written an article “Can we have a sensible debate” that I think you should all read either before or after reading the “debunk” article and this one. It will really help frame the conversation.



I have two guiding principles with regards to Oracle.  One is that every day, each and every day, I learn something new about Oracle I did not know the day before.  Each and every day.  Maybe that is more of a fact than a principle but that is the way I approach the database.  That mindset helps me keep my mind open to someone correcting something I’ve said.



The other one of my guiding principles is that I ask for proof from everyone (yes, it can be annoying, it carries over into my personal life as well. Ask my wife and kids). Is this a scientific/mathematical proof as in “prove the square root of 2 is not a rational number”? No, and it never has been. What I need to see is proof as in evidence. When I have a conjecture, I try to rationalize, to hypothesize why that conjecture might be true. Then I provide supporting evidence. Why do I do this? Is it because I can generate written material faster? No, not a chance. I could produce 10 times the material I do now if I skipped the illustrative examples. If I were to produce paragraphs instead of pages, I could be much more apparently productive.



Or would I be?  Actually, what I have found time and time again is that the only thing I would accomplish would be to be wrong more often than I already am.  When you rely on 'expert' advice, especially when you buy a book, or when you read a paper on the internet authored by a name known in the industry -- you expect more.  You expect to be able to have a certain level of trust in the contents.  You expect that the information, the advice being passed to you is more than likely "correct".



I know of only one way to accomplish that.  That is to provide some “proof”, evidence.  Some factual reasoning why what I believe to be true most likely is.  And present it in a fashion that lets the entire world look at it and say “I understand”.  Or, look at it and say “yeah, but…..”.  I love “yeahbuts”, I learn more from “yeahbuts” than anything else.  The "yeahbuts" out there look at the cause I am talking about, my hypothesized effects, my reasoning and say “yeah but we could accomplish it like this couldn’t we and it would be even better”.  That, that makes my day.



Experts that offer advice without supporting evidence are in my opinion the danger.  The advice might be actually good advice – when you understand the caveats that go with it.  So I’m not really sure what to make of articles like the “Understand the Oracle ‘proof’ Myth”.  From there I read:



What it has brought to light is the fact that you must be careful not only in what you believe to be true but in what you prove to be true



Huh, “but in what you prove to be true”. Forget for a minute the difference between “proving something to be true” and “providing some evidence that you know what you are talking about, factual evidence that can be tested by others” (Jonathan’s article covers that concept well) – I take exception to that statement. It was “obvious” in the past, just common knowledge that “a Doctor washing his hands was not professional”. Funny thing, I am rather glad someone decided to research this and debunk it. Aren’t you? (I sort of wish that atom bomb proof hadn’t been done, but that is getting into theology, personal beliefs and off topic). I myself am glad there are people out there taking “the obvious” and showing that what was “obviously” true is actually false. He goes on to say:



So now we have a new crop of experts providing proofs (sometimes limited to a single-user, small database on a laptop) that their methods are the best and no doubt their proofs will be cited long after they are useful or meaningful and their expert advice will fade into old DBA tales as new technologies and methods become the rage. This is as it should be.



So be careful in what you prove and how you prove it. What seems clever today may come back to haunt you. You may be apologizing several years down the road for what is right now, but not in the future.



Well, I would like to know who these experts are and where their proofs are. But when asked repeatedly in a recent thread on their website, Mike and Don wouldn’t actually give any examples of who these people are, nor would they point us to one of these bad proofs. The proofs (aka evidence) I repeatedly see are not proofs about “their methods”, but rather examples that demonstrate how the software called Oracle actually works. The funny thing with the evidence that these experts find so quaint (as you’ll see in that thread, Mike and Don have a very strong hangup with this word “proof” and have their own very strict definition for it) is that they or anyone can use that evidence. Anyone should be able to reproduce the findings and study them to see if what they show still holds true. Without the supporting evidence, we have in fact no chance whatsoever of seeing if what they say even makes sense anymore.



For example, I’ll give a true to life performance tip. You’ll see this in many books. You’ll see it in my book Effective Oracle By Design (and I will not be apologizing for it). Now, there are two ways I could present this performance tuning tip I’m about to give. Here is a hypothetical way, and then you’ll see how I really presented it:



<quote src=Never To be Printed by Tom>

Use Bulk Processing



It should be obvious that doing many operations in one call instead of many calls to the database is more efficient.  I have seen many times that fetching our entire result set using BULK COLLECT in one SQL statement runs faster (about twice as fast) than doing the same thing a single row at a time.



The response times you see will be improved. The more data you bulk-fetch, the better relative performance you will see from the BULK COLLECT.



The larger the array size, the fewer logical IO’s you will do and the better the performance and scalability. You need to rewrite the code using PL/SQL table types or collections. We needed to declare variables to fetch into.



Bulk fetching is truly a good thing.

</quote>



Those couple of paragraphs sum up very nicely the typical salient effects of Bulk Collecting in PLSQL.  It tells you that bulk collects are good, they do many things in a single call, they are super efficient.  You should use them, you should rewrite your code to use them.  There are no caveats here, you are not shown how to measure your return on investment, nor are you really given any measure beyond “about twice as fast”.  It is short, to the point.



But it is seriously lacking.  What happens if you read this and say, well if bulk collecting 100 rows is good, 1,000 must be really good and heck, why use LIMIT at all, just bulk collect all of the rows at once. Or what happens if most of your queries are 1 to 5 rows, does this hold as true (is the extra work worth it)?  Or, what happens if you own 10g and read this?



Well, fortunately, I won’t write like that. Too many times in the past when just providing my experiences, I’ve been wrong. I use pages where others might use paragraphs and try to convey an understanding of why what I say is true (or not, you can see for yourself!). Here is a quote from that book (very similar to others you’ll see in fine books like Mastering Oracle PL/SQL: Practical Solutions by Connor McDonald) showing how I truly believe all technical information needs to be presented. Your ideas come out, what you think is true comes out, some evidence supporting that is shown, the numbers are discussed, caveats explored – it is all right there:



<quote src=Effective Oracle by Design>

Use Bulk Processing When It Has Dramatic Effects



As an example, we'll compare processing the EMP table 14 rows at a time versus processing it a row at a time. Here is the version for row-at-a-time processing:



ops$tkyte@ORA920> exec runstats_pkg.rs_start;
PL/SQL procedure successfully completed.

ops$tkyte@ORA920> begin
    for i in 1 .. 5000
    loop
        for x in ( select ename, empno, hiredate from emp )
        loop
            null;
        end loop;
    end loop;
end;
/
PL/SQL procedure successfully completed.



And here is the version that uses bulk processing:



ops$tkyte@ORA920> declare
    l_ename    dbms_sql.varchar2_table;
    l_empno    dbms_sql.number_table;
    l_hiredate dbms_sql.date_table;
begin
    for i in 1 .. 5000
    loop
        select ename, empno, hiredate
          bulk collect into l_ename, l_empno, l_hiredate
          from emp;
    end loop;
end;
/
PL/SQL procedure successfully completed.



Running Runstats to compare the versions shows the following:



ops$tkyte@ORA920> exec runstats_pkg.rs_stop(10000);
Run1 ran in 274 hsecs
Run2 ran in 132 hsecs
run 1 ran in 207.58% of the time



This shows that fetching our entire result set using BULK COLLECT in one SQL statement runs faster (about twice as fast in this case) than doing the same thing a single row at a time.



The response times you see will be a function of the amount of data you array-fetch, as well. More or less data in the result set will have a definite impact on the performance here. The more data you bulk-fetch, up to a point, the better relative performance you will see from the BULK COLLECT over time. For example, when I put 56 rows in EMP, the BULK COLLECT version was 380% better. When I put 1 row in EMP, both versions ran in the same amount of time. At some point, however, the BULK COLLECT will cease being more efficient, as the amount of RAM it consumes increases greatly. Where that point is varies, but I find a BULK COLLECT size of about 100 rows to be universally "good" in practice. Later, we'll look at using the LIMIT clause to control this.



Looking further in the Runstats report, we see some interesting numbers:



Name                                  Run1        Run2        Diff
STAT...session logical reads        80,522      15,525     -64,997
STAT...consistent gets              80,003      15,004     -64,999
STAT...buffer is not pinned co      70,000       5,000     -65,000
STAT...no work - consistent re      70,000       5,000     -65,000
STAT...table scan blocks gotte      70,000       5,000     -65,000
STAT...recursive calls              75,003       5,003     -70,000
LATCH.cache buffers chains         162,601      32,582    -130,019

Overall latching is reduced.

Run1 latches total versus runs -- difference and pct
   Run1        Run2        Diff       Pct
188,736      58,658    -130,078   321.76%

PL/SQL procedure successfully completed.



That is analogous to what we observed in SQL*Plus in Chapter 2, when we played with the ARRAYSIZE setting while using AUTOTRACE. The larger the array size, the fewer consistent gets we performed, and the better the performance and scalability. The same rules apply here, but the impact is not as transparent as just adjusting an ARRAYSIZE setting is. Here, we needed to rewrite the code using PL/SQL table types or collections. We needed to declare variables to fetch into. We used more memory in our session. We can use V$MYSTAT, a dynamic performance view, to see the net effect on memory usage.



It is for these reasons that I recommend using bulk processing only where and when it would have the most dramatic effect. In the example shown here, it looks dramatic. But that is only because we did it for 14 * 5,000 rows. It would be worthwhile here, if you did that process many times. If you did that process for 50 rows, you would discover they run in about the same amount of time and that the BULK COLLECT actually does more latching!

</quote>
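The excerpt above mentions the LIMIT clause and V$MYSTAT without showing either. What follows is a minimal sketch of both, not the book's own code. It assumes the same EMP table used in the example, takes its batch size of 100 from the figure quoted in the excerpt, and assumes the standard session memory statistic names.

```sql
-- Sketch: fetch EMP in batches of at most 100 rows instead of all at once.
declare
    cursor c is select ename, empno, hiredate from emp;
    l_ename    dbms_sql.varchar2_table;
    l_empno    dbms_sql.number_table;
    l_hiredate dbms_sql.date_table;
begin
    open c;
    loop
        -- LIMIT caps how many rows (and how much memory) each fetch uses
        fetch c bulk collect into l_ename, l_empno, l_hiredate limit 100;

        for i in 1 .. l_ename.count
        loop
            null;  -- process row i of the current batch here
        end loop;

        -- test %notfound only after processing, so a final partial
        -- batch is not skipped
        exit when c%notfound;
    end loop;
    close c;
end;
/

-- Session memory, to compare before and after a run
select n.name, s.value
  from v$mystat s, v$statname n
 where s.statistic# = n.statistic#
   and n.name in ('session pga memory', 'session pga memory max');
```

Running the memory query before and after each variant gives a rough view of the extra memory an unbounded BULK COLLECT uses compared with a limited one.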



There you have pretty much all of the facts – and perhaps most importantly you have some way to verify the truth.  Say you just bought that book (I wrote that book as Oracle 9iR2 was in full swing, Oracle 10g was still a glint in our eyes) and you read the advice.  You might be tempted to take it to heart and write all of your code using BULK COLLECT, or even worse, rewrite all of your existing code to do so.



Only to discover the expert is totally wrong.



Because things change. (Just a side note: most everything else in that book applies to 10g! And you have the examples available to you to see that.)



But fortunately you bought the book with evidence, and you grabbed the examples from the website.  And you ran it in 10g.  And you observed:



ops$tkyte@ORA10G> exec runStats_pkg.rs_start
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> begin
    for i in 1 .. 5000
    loop
        for x in ( select ename, empno, hiredate from emp )
        loop
            null;
        end loop;
    end loop;
end;
/
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> exec runStats_pkg.rs_middle
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> declare
    l_ename    dbms_sql.varchar2_table;
    l_empno    dbms_sql.number_table;
    l_hiredate dbms_sql.date_table;
begin
    for i in 1 .. 5000
    loop
        select ename, empno, hiredate
          bulk collect into l_ename, l_empno, l_hiredate
          from emp;
    end loop;
end;
/
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> exec runStats_pkg.rs_stop(10000)
Run1 ran in 51 hsecs
Run2 ran in 48 hsecs
run 1 ran in 106.25% of the time

Name                                  Run1        Run2        Diff

Run1 latches total versus runs -- difference and pct
   Run1        Run2        Diff       Pct
 58,709      58,859         150    99.75%

PL/SQL procedure successfully completed.
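The percentages that runstats prints are plain ratios of Run1 to Run2, so they can be checked by hand. A small sketch, using only the numbers shown in the 9i and 10g transcripts above:

```sql
-- Elapsed time: Run1 as a percentage of Run2
select round(274 / 132 * 100, 2) as pct_9i,    -- 207.58
       round( 51 /  48 * 100, 2) as pct_10g    -- 106.25
  from dual;

-- Total latches: Run1 as a percentage of Run2
select round(188736 / 58658 * 100, 2) as latch_pct_9i,   -- 321.76
       round( 58709 / 58859 * 100, 2) as latch_pct_10g   -- 99.75
  from dual;
```

Read this way, the 9i row-at-a-time run took about twice the time and three times the latches of the bulk run, while in 10g the two runs are within a few percent of each other.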



Well, it would seem hardly worth it to do the bulk collect here, wouldn’t it? The reason – in Oracle 10g, PL/SQL is silently array fetching 100 rows at a time for us. When we do “for x in ( select * from t )”, PL/SQL has already bulk collected 100 rows. We no longer need to write that extra code or do the extra work.



And, had I not had a test case, I might still not know that. I might still be giving the advice “bulk collect, everywhere”.  Someone reading my book was kind enough to email me and ask “why do I get these numbers when I run your example”.  60 seconds of research and I found out “why” (tkprof reveals a lot! A simple sql_trace=true and away we go).
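A sketch of the kind of check described here, assuming the same EMP table. The trace file name is a placeholder, since its actual name and location depend on the instance's trace directory and the server process.

```sql
-- Trace the implicit cursor loop from the example
alter session set sql_trace = true;

begin
    for x in ( select ename, empno, hiredate from emp )
    loop
        null;
    end loop;
end;
/

alter session set sql_trace = false;

-- Then, on the database server, format the trace file with tkprof
-- (both file names below are placeholders):
--   tkprof <your_trace_file>.trc report.txt
--
-- In report.txt, find the SELECT against EMP and compare the "count"
-- and "rows" columns on its Fetch line. A fetch count close to the row
-- count means row-at-a-time fetching; a much smaller count means the
-- rows are being array fetched.
```

Running the same trace on a 9i and a 10g database should show the difference in fetch behaviour that the text describes. The exact counts depend on the number of rows in EMP.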



Now, it is ironic, because in his quest to not have to show any evidence, Mike gives the reason why supporting evidence is a good thing:



So be careful in what you prove and how you prove it. What seems clever today may come back to haunt you. You may be apologizing several years down the road for what is right now, but not in the future. The net is an amazing place. No doubt you can find numerous places where I may have expounded on the virtues of separation of indexes and tables, on rebuilding indexes frequently or other items that have since been proved, for current versions, to be old DBA tales. I, like others before me, am not immune to time and neither is my advice.



If I had written the few paragraphs above (the ones without supporting evidence), then I would have to apologize.  I would have to apologize for not giving you the truth, the whole truth and nothing but the truth and a way to find out if it was still the truth.  Instead, I get to say “well, thank goodness you didn’t trust me and ran the examples in your testing/development environment.  That is precisely why I supply them to you”.  And then followup with this brand new feature of 10g that I was until that time not aware of (struck by the infamous but not elusive “yeahbut” animal again!).



It is ironic: if no one had taken the “obvious” advice of old and debunked it, we would perhaps still be listening to it. You see, if no one had bothered proving that the old wives’ tale of separating indexes from table data for performance (which by the way was always an old wives’ tale, believe it or not, but that is another thread) was in fact an old wives’ tale – what would we still be doing? If you have a conjecture based on your experiences and you show how to see the cause and effect, rather than just stating “it must be so”, why apologize? We all know things change. My question to Mike – without the evidence, how the heck are we going to know when something has changed? So I would really like to understand why it would be bad to show that something is not true, to provide evidence – as Mike indicates it would be.



Mike closes that article with this advice



If any paper you read is older than a year or two, I suggest you take its advice with a big grain of salt as it may be applicable only to history and not current events.



Critics love to dig up old papers, presentations and advice given and use it to bludgeon people into believing they are the only authority. Take this type of advice with a grain of salt as well. In time, their advice will be referenced, out of date and out of style.



I don’t see anyone getting “bludgeoned”. I don’t see anyone trying to be “the authority”. I do however see a lot of people having a discourse, showing evidence, backing up their conjectures with reasoning. I personally would have written the closing like this:



If any paper you read tells you what but not why, regardless of its timestamp, I suggest you take its advice with a big grain of salt as it may be applicable only to specific cases and not applicable to you at all. In fact, it could be something that was true in the past and not true anymore. If they give you no way to find out, ignore it.



I personally regard unsubstantiated advice from anyone as just that.  Unsubstantiated.   Depending on who told me, I might treat it differently.  If Jonathan Lewis, Cary Millsap or others told me something in a conversation, I would place a high level of confidence in it (I might well still test the idea, in fact I know I would).  But I would have a high degree of confidence in it.  Why?  Well because time and time again, they have backed up their comments with some logic, evidence, proof, facts.  I feel confident that prior to making a claim, they would have already tested it out.  If they hadn’t, the information would be conveyed with that fact made well known (that they hadn’t tested it, it was just an idea they had at that point).  That is why I would trust their information – but probably still try it out.  I know they do.  But anyone that just says “I’ve seen it before, trust me”, sorry but no.



The Economist had an interesting story, a timely one actually, that was pointed out to me: [url]http://www.economist.com/science/displaystory.cfm?story_id=3809661[/url]. My favorite comment in that article was this:



<quote>

Formal proof is a notion developed in the early part of the 20th century by logicians such as Bertrand Russell and Gottlob Frege, along with mathematicians such as David Hilbert (who can fairly be described as the father of modern mathematics) and Nicolas Bourbaki, the pseudonym of a group of French mathematicians who sought to place all of mathematics on a rigorous footing. This effort was subtle, but its upshot can be described simply. It is to replace, in proofs, standard mathematical reasoning which, in essence, relies on hand-waving arguments (it should be obvious to everyone that B follows from A) with formal logic.

</quote>



It should be obvious to everyone that B follows from A.  Hand-waving arguments.  Now, that article does not apply entirely to this topic, it is about mathematics and mathematical proofs.  That is not what we are after here, we are after the truth.  We would like to understand the cause/effect relationship between two things.  Recently, I listed some of these in a follow-up to a question on my site.  Concepts that were, sometimes still are, accepted as fact based on apparent correlations that were false positives (it should be obvious to everyone that B…)



Historically I believe false causality has been the greatest cause of Oracle Myths. "Single extents are best". A falsehood brought about by observation based on export/import. There was truth to be gleaned from the export/import maneuver, however the conclusion many experts of the time came to was wrong. They observed something – that an export with compress=y followed by an import many times results in better performance. Obviously, it was due to a single extent (wave hands now).



Were there other reasons? Absolutely, and they are all valid – and fixable via other much less dramatic means. What had to be done however was to show some evidence that – if you hold all other things constant – segments in single extents performed comparably to segments in many hundreds or thousands of extents. That was in fact done, exposing the single extent myth for what it truly was. The papers exposing this myth typically went on to describe what symptoms the export/import probably fixed and other methods for achieving the same goal, in a better way.



For example, perhaps the table had many migrated rows (due to an improper pctfree setting), and exporting the table and importing it corrected that temporarily. Had we recognized the true cause/effect, we could have fixed that problem once and been done with it. No, instead we come to the conclusion that you need to exp/imp your tables on a recurring basis to get them into single extents. Many a script exists out there to find objects with double digit numbers of extents and “fix them”.
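One way to look for the migrated-row cause directly, rather than inferring it from an export/import, is sketched below. EMP stands in for whichever table is suspected. The sketch assumes the classic ANALYZE command is acceptable in the environment, since ANALYZE rather than DBMS_STATS is what populates CHAIN_CNT.

```sql
-- Gather statistics with ANALYZE so that CHAIN_CNT is filled in
analyze table emp compute statistics;

-- CHAIN_CNT counts rows that are chained or migrated; compare it with
-- NUM_ROWS and look at the PCTFREE setting currently in effect
select table_name, num_rows, chain_cnt, pct_free
  from user_tables
 where table_name = 'EMP';
```

A CHAIN_CNT that is a large fraction of NUM_ROWS points at the pctfree explanation rather than at the number of extents.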



Or, perhaps one of the indexes on the table was a “sweeping” index.  It was an index on a column populated by a sequence.  There was a process that put rows into this table.  There is another process that looks for the “oldest unprocessed” row and processes it.  For an example of this follow that link.  The export/import of the table would have recreated that index (and in fact all indexes, whether they needed it or not).  It would have solved this problem, but this problem would have been more readily solved by understanding how the data was used and rebuilding/coalescing precisely one index.
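A sketch of inspecting and coalescing just that one index. The index name t_seq_idx is hypothetical, and VALIDATE STRUCTURE as written here blocks changes to the table while it runs, so it is best tried on a test system first.

```sql
-- t_seq_idx is a hypothetical name for the index on the
-- sequence-populated column
analyze index t_seq_idx validate structure;

-- INDEX_STATS holds one row, for the index most recently validated
-- in this session
select name, lf_rows, del_lf_rows, lf_blks
  from index_stats;

-- Merge sparsely populated leaf blocks in this one index only
alter index t_seq_idx coalesce;
```

Comparing DEL_LF_ROWS with LF_ROWS before and after the coalesce gives some evidence of whether this index was the problem, without touching the table or any other index.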



Maybe this was a case of “we purge data” and now the table contains lots of “whitespace” over time.  And, we do full scan it.  Export/Import would get rid of the whitespace.  But we could have perhaps been avoiding the whitespace issue altogether as well.



And so on.  Why is it relevant?  I mean, if the export import accidentally fixes things, that is OK right?  Well, no.  First of all, every time you take data out of the database you risk losing it and having to go to backups.  It incurred totally unnecessary downtime.  It took much longer than it could have, had we looked at the problem.  It causes lots of people to reorganize data frequently and unnecessarily.



So, does this still happen today? Absolutely. There are a lot of hand-waving arguments (it should be obvious to everyone that B follows from A) still. I’d like to look at one, it has come up on my site recently in this follow-up/review. It was talking about the ability to use multiple size blocks in Oracle 9i and up and whether indexes really “like” really large blocks. A reference to read first would be this white paper. It all sounds very good, doesn’t it? I see no caveats, no drawbacks, no “your mileage may vary”. I will be coming back to do another article, taking a look at the theory that large blocksizes are a good thing all around but that their adoption is being hindered by:



However, widespread acceptance of using multiple blocksizes has been hindered because the I/O reduction cannot be proven using simple SQL*Plus scripts on personal computers and because multiple blocksizes were originally created to support transportable tablespace



There might be a reason why their acceptance should be hindered (eg: what Don points out as a bad thing might well be a very good thing indeed).  And it could be because of false causality.  It will be a study in why apparent correlations could well be false positives.  More study does need to be done.  Wouldn’t you like to know if the perceived benefits of multiple blocksizes could be achieved in a much less intrusive way?  Or how to actually measure their benefit?  Or that the benefit that was measured actually was a benefit? (if that last statement confuses you, well, I’ll have to ask you to wait for the other paper to be completed)
In closing…



I truly believe that people wanting to provide information should back it up.  Give evidence.  Yes, I know, we are all busy people and providing the evidence is “hard”, but if you want to be a published author, if you want people to accept your advice, you have to go that extra mile.  The information you use should be real, not made up.  You’ll find you make far far fewer errors that way.  You should be able to explain everything in your findings – not most things – and you cannot just ignore things that don’t fit your hypothesis.



One of my favorite stories about this, about making sure you know what you are talking about, goes like this (it is true, not made up, swear to god, I have witnesses! 100 of them in fact for this one).  There was a gentleman giving a seminar.  During the weeks before the seminar he had an idea to allow the attendees to ask questions they had on their minds but never had the time to adequately research and answer themselves.  Sounded like a good idea, but some of the questions he received took a really long time to answer.  It was a three day seminar, lots of stuff to get ready, much preparation was necessary, time was getting short.  One of the questions was about the overhead of two phase commit.  Well, how do you describe the overhead of a two phase commit?  All you can do is set up some multi-user benchmarks in a distributed environment and check it out, measure it.  Describe how to measure it so anyone could do the same in their own environment.  This gentleman set up the test, did all of the work, collated all of the numbers, had nice findings.  But apparently he proved that a distributed transaction will have a most profound effect on LGWR.  LGWR is massively impacted by two phase commits.  The log file syncs with two phase commits are provably very long.  No two phase commit and no waiting.  He came to this conclusion because the waits on log file sync jumped from “none” to “a whole bunch”.  Obviously this is a side effect of two phase commits.



Fortunately or unfortunately depending on your perspective, there was a smart fellow in the audience.  This was the first time this material was being presented and this smart fellow followed along carefully.  As the presenter was finishing up this section and asking for questions – the following happened:



Jonathan Lewis raised his hand and asked Tom Kyte if he knew about the commit optimization in PLSQL (page down to my response from there for a demonstration of it).  Sound of hand smacking forehead.  Of course, he says, that explains the difference – without a distributed transaction, PLSQL can do its commit time optimization and not suffer from log file syncs, whereas in a distributed environment that commit optimization cannot happen.  So, while the evidence was valid, under the scrutiny of peers – it was shown to be incomplete.  Had the simulation been performed with Java/JDBC, Pro*C, OCI, VB, or any language other than PLSQL – the findings would have been different.  So, if that person giving the seminar had just written into an article or a book “It has been my observation that if you two phase commit, you should expect to see massive increases in log file sync” (because his experience was on systems with lots of PLSQL) – that information would not apply to the public in general.  It would not be true for anyone doing Java/JDBC programming, for example.  But by publishing his experience, his evidence, and subjecting it to review by peers, we now know the truth, the whole truth and nothing but the truth (and yet another reason to code database stuff in PLSQL perhaps!).



And, even better, if the truth changes in the future, we have the test case waiting in the wings to tell us that.  If all we had were experts saying “do this, it is very good”, well, we would not.  Features like that array fetch of 100 rows in PLSQL happening in 10g?  I never would have known that had I not had test cases that were dramatically affected by it.  I would still have been programming with BULK COLLECT in 10g, even though it is not necessary!



Things change.



It only takes one counter case to show something is not universally true.



Rules of thumb without evidence, without ways to test them before implementing them, are dangerous, seriously dangerous.
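
As a rough illustration of the last two points, here is a minimal Python sketch of a re-runnable test that hunts for a single counter case.  The rule being tested ("every odd number greater than 1 is prime") and the names `is_prime` and `find_counterexample` are hypothetical stand-ins chosen for the example; they are not taken from the article and have nothing to do with Oracle itself.

```python
from typing import Callable, Iterable, Optional


def is_prime(n: int) -> bool:
    """Trial-division primality check, good enough for small inputs."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def find_counterexample(rule: Callable[[int], bool],
                        cases: Iterable[int]) -> Optional[int]:
    """Return the first case that breaks the rule, or None if every case passes."""
    for case in cases:
        if not rule(case):
            return case
    return None


# Hypothetical rule of thumb under test: "every odd number greater than 1 is prime".
# It holds for 3, 5 and 7, which is exactly how plausible-sounding rules get started.
odd_numbers = range(3, 100, 2)
counterexample = find_counterexample(is_prime, odd_numbers)
print(counterexample)  # 9 is the first odd number in the range that is not prime
```

Because the check is kept as a script rather than as an assertion in someone's memory, it can be re-run whenever the environment changes, which is the role the saved test cases play in the story above.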

ern posted on 2005-5-3 17:15

Are all Oracle Scientists created equal?--Don Burleson

This one smacks a bit of a personal attack.

Are all Oracle Scientists created equal?

Don Burleson
February 27, 2005


Back when I was a student, the web was the exclusive bastion of students, professors, and scientists.  It was customary to sign your messages with your qualifications, so everyone knew if they were talking to a Professor Emeritus or a pesky research assistant.  In scientific discourse, the opinion of a Masters Degree from M.I.T. generally had more credibility than a message from someone with a B.S. from the University of Pittsburg.

The web has evolved into a dangerous place today, and today you have to find out who you are talking to.  In person, when I’m talking to Dr. Jones I know that he has a doctoral degree, and anyone who says that they are an Attorney or Engineer, must, by law, have the appropriate college degree and state license.

But what about someone who says they are an Oracle “scientist”?  Does that tell us anything?  Cary Millsap cautions Oracle professionals against relying on people who tell you they’re experts or people that publishers apparently trust (he was probably talking about me!), but I agree, you need to "trust your source" and verify their academic, scientific and research credentials!  Cary is the "real-deal" Oracle scientist and I enjoy discussing Oracle with him.  He and I disagree strongly about many Oracle issues, but neither of us would ever consider resorting to mudslinging or insults (like some others who post a link to my horse milk project or my redneck philosophy pages).

Some embrace the Oracle science movement wholeheartedly, while others disagree on the pragmatic value of the Oracle scientific movement, as this ex-Oak Table member notes:

    "The scientific minutiae, however "true" they might be, would merely confuse; the bold, sweeping statement, however simplistic, will nevertheless explain, despite the bold, sweeping statement not being technically accurate. I simplify like that every time I'm in the training room, and most times I post to c.d.o.s: but it is clearly incompatible with the pure science of Oracle espoused by the Oak Table.

    There is also a danger that one gets so addicted to that sort of science that one forgets that it is, of itself, of practical relevance to a miniscule number of people"

In my experience in legitimate science circles, no real scientist would distract from a discussion by posting arrogant, off-base, or irrelevant insults. You can instantly spot a fake Oracle scientist when they start insulting their fellow professionals: "Actually calling HJR a bastard is not an insult; it is an elevation -- just ask his mother."

I’ve noted a disturbing tendency of some Oracle scientists to arrogantly toss out the arguments, not sharing their background or qualifications, and hurling offensive personal insults. To me, it's unprofessional and tarnishes the respectability of the database profession.  The Oracle Server UseNet newsgroup is now largely populated with profanity and nasty non-technical content.

It's also degraded on the Oracle-owned forums.  In a recent forum discussion about Oracle's new predictive models (I just posed an Oracle challenge on forecasting with Oracle), it struck me as very strange that the responses treated me to a discourse on the Saxon root of the word “precepts”, and one Oracle scientist felt compelled to interject my company's dress code into the discussion!  (Really, I’m not making this up.)  Also, many of the Oracle scientists don't seem to understand some very fundamental scientific concepts.

Whenever I engage an alleged “scientist” or “expert” in a discussion, the first thing I do is Google their academic and research background.  This gives me important insights into their mind-set, areas of computer science research, and their overall qualifications.  In my experience, all “real” scientists publish their academic qualifications, scientific research and whitepapers, and university teaching experience.  BTW, here is the Google syntax that I’ve automated (using the Google API) to quickly see a resume or C.V.:

   oracle "scientist_name_here" (CV | C.V. | vitae |resume | resume')
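
For readers wondering what automating that query might look like, here is a minimal Python sketch that only assembles the search string shown above and URL-encodes it.  The function name `build_cv_query` is hypothetical, and the sketch deliberately stops short of calling any search API, since the article does not say how the Google API was actually used.

```python
from urllib.parse import quote_plus


def build_cv_query(name: str) -> str:
    """Fill a person's name into the search template quoted in the article."""
    # The operator list is copied verbatim from the article, spacing included.
    return f'oracle "{name}" (CV | C.V. | vitae |resume | resume\')'


query = build_cv_query("scientist_name_here")
print(query)

# URL-encoded form, suitable for use as a single query-string parameter value.
encoded = quote_plus(query)
print(encoded)
```

With the placeholder name, the first `print` reproduces the template line exactly as it appears in the article.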

Knowing the qualifications of those who proclaim to be Oracle “scientists” is very important.  In this world of fakers and posers, all Oracle professionals need to have strong “BS” radar and a quick Google search can tell you how much weight to give to the assertions of any Oracle “scientist”. Personally, I check for these qualifications:

    * Experience at Oracle corporation - Was this person intimately involved with the internal machinations of Oracle at Redwood?  Nobody knows Oracle like the folks who built and maintain it (especially if they have source code!).

    * Computer Science Background - What is their academic CS background?  Were they good enough to get into a respected university science program?  Were they able to compete effectively for entrance into a competitive graduate school? Does the Oracle “scientist” have a Masters or Doctorate degree in science?

    * Computer Science Research - I quickly check the ACM and IEEE archives to locate all research in any scientific journals, and use Google to find all presentations and whitepapers at computer science conventions.

For example, I was recently approached by a stranger named Scott Martin (www.tlingua.com), who made some bold assertions about creating an Artificial Intelligence (AI) engine for Oracle.  Suspicious at first, I Googled him and I quickly discovered his scientific background as an ex-member of the Oracle Kernel Group.   Plus, his Master’s Degree from M.I.T. added legitimacy to his arguments about the benefits of predictive modeling in Oracle10g.

Ceteris paribus, I’m much more inclined to listen to Oracle professionals who publish their credentials.  Here is just a small sample of the stellar qualifications of some Oracle professionals:

    * Don Bergal - Co-developer of the Confio Oracle tuning tool with an MBA from Harvard Business School

    * Gaja Krishna Vaidyanatha - A former Oracle Corporation OEM developer with a master's degree in computer science from Bowling Green State University

    * Dr. Tim Hall - Oracle author and sponsor of the hugely popular oraclebase.com - Tim is a black-belt in Karate and has a PhD in Molecular Genetics

    * Dr Daniel Morgan - A respected Oracle RAC instructor with a PhD from Stanford University

    * Cary Millsap - A respected Oracle author and former Oracle VP, with ten years at Oracle Corporation as one of the company's leading system performance experts of the Oracle Performance group.

    * Jonathan Lewis - Respected Oracle author with a Masters Degree in Mathematics from Oxford University.

    * Dr. Paul Dorsey - Noted Oracle author, PhD, and respected UML expert

    * Dr. Arun Kumar - Oracle author and winner of the prestigious Young Engineer of the Year award by IEEE

    * Colonel John Garmany - Author of six books and IEEE member (Electrical Engineer from West Point with a Masters Degree in Software Engineering).

But what about the background of other self-proclaimed Oracle scientists?  When I do a Google search for some self-proclaimed Oracle scientists, I can’t find out much about their science background.  Gee, I can’t even find out if some of them ever completed high school.

For example, the Oracle Oak Table group claims to be “a network for the Oracle scientist”, yet I was concerned when I could not locate some of their members' scientific research, academic achievements, awards, computer science conference proceedings, nor their membership in professional computer science organizations such as ACM or IEEE.

Now, I’m sure that they are just being modest, but it makes me wonder: is a publication by an Oracle Scientist the pontification of a Harvard PhD or the ruminations of a high-school dropout?  Frankly, I’d like to know.

I’m hopeful that all self-professed Oracle scientists will publish their qualifications, degrees, research and achievements so that the Oracle community can give them the respect and credence that they deserve.

Reader Comments:

Dr. Amjad Daoud (OMLET) - Founder of the Stanford Information Technology Corporation

Shut the f**k up you are old fart, like your momma's tampon did.  I recommend the following to get your s**t down to you a** degenerate old bastard man with small balls.

Michael Cunningham - Napa California

Nicely put, Don.

Well . . . I saw myself in your article - especially with regards to the thing about being an engineer. I learned something. As much as I give credit to those with college degrees I'm not willing to take from those without. I work with a degree holding "senior" dba. My feeling is when he got his degree he was put into the mindset of "I'm finished", "I got it", "now I'm done".

I have to work with this guy every day who does absolutely no proactive reading or research and is still stuck on 8.1.7 technology. When he has a question about 9i or 10g he, "the senior dba", comes and asks me. I am in charge of maintaining our disaster recovery site because I "Read the F**king Manual". He was still using a technique used with version 7.3. I know there are many who don't take this "back seat" to learning after obtaining a degree (just like you, Mike Ault, Col John, and others you have mentioned). One of the reasons I spend so much time on the forums learning and trying to participate is to compensate for the mentorship I don't receive at work.

I would expect good leadership would see that as a sign of a self starter. Someone who can see where there is a need, learn it, and implement it. A college degree does not guarantee that, although it does give someone the edge over someone like me. It sounds more like a union workplace. The one with the seniority gets the promotion - qualified or not - over the less senior person who might really know his stuff.

I've been in this debate before about degrees and all I really have come up with is that I really do wish I would have completed my degree. And even without a degree I'd still hire people like Bill Gates, Steve Jobs, Walt Disney, and Dave Thomas (the founder of Wendy's who got his GED at the age of 59).

I agree with you about credentials and the paper degree. I'm not, however, willing to discount the efforts of someone without the paper.

Shushil Patel - New Jersey

Are you an Oracle scientist?

Personally, I've seen people at parties introduce themselves as a "scientist", and others start asking about their areas of research.  Oracle is a software package, not a science.  I publish my resume online, and I have some qualifications that might be considered scientific, but not nearly to the level of a real scientist:

    * Nine college classes in Calculus, multivariate statistics and decision sciences, from a 4th-tier, mediocre state university.

    * Masters Degree (MBA in IT), and 15 years as an adjunct professor, teaching over 100 grad school courses in computer science and IT.

    * Published in legitimate science journals (Information & Management), and presented whitepapers at several science conferences (IEEE, etc.).

When I introduce myself I tell people "I work with computers".

Kent Crotty - Oracle consultant

I would have to agree with Don.   To lend any legitimacy to arguments made by any ‘scientist’, that individual must have the backing of research and qualifications.    The very definition of the word demands this:

[url]http://www.webster.com/cgi-bin/dictionary?va=scientist[/url]



This new Internet world also demands it.   The Internet provides us the ability to quickly communicate over vast distances with scientists we possibly will never meet face-to-face.   In order to trust or believe in the arguments set forth by a proclaimed scientist, that scientist must also be able to present credentials.   Credence for a scientist can only be found through published qualifications, research or achievements.    The scientist must present these on the Internet for all to see.

I, myself, do not proclaim to be a scientist but I do feel that I am educated and experienced.  I present my credentials through my resume.  Shouldn’t scientists do the same?   Presently I am working towards becoming a scientist - an Oracle scientist.   When I want to be called a scientist, I will publish my credentials in order to legitimize myself and the place that I will publish is on the Internet for all to see.

From Anonymous:

You are bing (sic) degrading to scientists.

I find some people's self-proclaimed label of "scientist" to be offensive to all legitimate scientists.  To me, titles convey qualifications and calling yourself a scientist denotes some serious academic qualifications.

Professor Jones is a teacher, Dr. Smith has a doctorate and John Doe, esq. has a law degree.  Try introducing yourself as "I'm a scientist" at a cocktail party. They will immediately start asking about your research areas and publications.

Harrison Conway - New York

When I started reading this I thought it was another of your funny jokes.  The idea of someone who thinks their usage of a software package warrants them being called a scientist is ridiculous.  Maybe I'm a Quicken Engineer and I didn't even know it.

Did you see the lawsuit against Microsoft for creating the title of Microsoft Certified System Engineers?  A Microsoft Engineer is as stupid as an Oracle Scientist.

No, I did not know about that lawsuit.  Thanks.

[url]http://www.peo.on.ca/enforcement/Quebec_MS_April2004.pdf[/url]

The OIQ had charged Microsoft Canada for knowingly causing a person who is not a member of the Ordre des ingénieurs du Québec, by authorization or encouragement, to use the title of engineer, thereby committing an offense under section 188.1 of the Professional Code, R.S.Q., c. C-26.

[url]http://www.eetimes.com/news/98/1026news/debate.html[/url]

Complicating the software-licensing controversy is the fight between the national licensing boards and industry over the proliferation of "certified engineers." Microsoft and Novell, in particular, have been promoting training programs that "graduate" Microsoft-certified or Novell-certified "engineers."

The State of Delaware, according to IEEE-USA, is eyeing a lawsuit to stop the use of the word "engineer" in those titles. Indeed, Novell and the state of Nevada licensing board have gone to court over Novell's use of the word "engineer" to describe trainees who may or may not have an engineering degree.

Edward Stoever - California

I wanted to post a response, but I am still thinking of what to say! My father is a retired superior court judge. He has a certain public image, which is incredible. Then, those of us who love him the most know his private, human side, which is a little different.

To me, you are a lot like my father in that way. Since the time I first heard of you, I have seen your public image: suits, ties, lots of books with excellent information, supporting others who are moving up in their careers.

You have done an amazing job with your career and your public image. I am still very much in awe.

I took a look at your redneck page the other day.  It's nice to know you have a humorous personal and private side. It makes getting to know you more interesting and fun. Just don't let the cat out of the bag.

Bipul Kumar - London

How is the response to your "Predictive modeling challenge"? From the Statistics perspective, it will be very interesting. I think Tom went a bit too far in criticising the idea on his forum.

I'm getting GREAT entries so far, and the proof will be how well they make real-world, verifiable predictions!  To see Oracle prediction tools, see Oracle data mining (ODM tool), v$db_cache_advice, and the SQLAccess advisor.

ern posted on 2005-5-3 17:18

In scientific discourse, the opinion of a Masters Degree from M.I.T. generally had more credibility than a message from someone with a B.S. from the University of Pittsburg.
The "B.S. from the University of Pittsburg" in this sentence is Thomas Kyte, heh. Here is Tom's original text:
Well, I'm just a poor math major with a BA degree from the University of
Pittsburgh (now no one has to wonder why Don picked that school in his "are all
scientists created equal"

[url]http://dba-oracle.com/oracle_tips_credence_qualifications.htm[/url]

pitch :)  Abstract Algebra.  It was four years of
mathematical proofs.

Guess the prove it to me thing just comes with the territory.

Not so sure I agree with your judgements of higher education in the US, but that
is a subjective feeling on my part (not really qualified to comment, so I
won't).

ern posted on 2005-5-3 17:34

From AskTom

Why wouldn't I respond?  April 09, 2005
Reviewer:  Mike (No alias) Ault  from Alpharetta, Ga

To the gentleman who said "Good luck getting Mike to respond" why in the world
would that be said? As Tom knows I readily respond, both to email and when I am
made aware, on forums. Unfortunately I don't track daily in all forums
everywhere to see who is saying what about me. Life is too short to get
flustered over childish behavior in others.

I agree with the scientific method. However, one of the foundation stones of
that method is that the test must as close as possible mirror the system that is
being tested. That is the number one issue I have with single user, small
database tests. A single user, small database test cannot be used as proof on a
large multiuser database.

I tend to use the TPCH database when I perform testing (for example, does RAC
perform better than multi-CPU systems, does SSD outperform SCSI arrays, what do
configuration changes in SSD/SCSI do to performance?) I also perform testing at
the component level (How does hdparm affect disk performance in Linux?) All of
these tests, the scripts used, the results and timing data are readily available
in the various books I have recently published.

In my books, I provide scripts, results and explanations. Anyone can repeat any
of my tests on their systems. I was one of the first to document the issues,
using scripts, that befall the shared pool and how to correct them. I also was
one of the first to show how to look into your buffer cache to see how the
blocks were actually used and not depend on such hand waving as hit ratios.

Before you paint everyone with that brush in your hand, better review the facts.
Like Tom, I have given bad advice, been called on it and have apologized and
retracted as required. That is what happens when you put yourself out in the
public domain. No one is right 100% of the time.

When I suggest something, I provide the background data on why. It may be a
reference at the end of the article or a link. In this world, you take good
advice wherever it can be found and apply it. I even take occasional tips from
Rogers and Foote and use them.

Yes, experts will disagree, on tips, on methods, on usage. I try not to make it
personal if I can. However, I am human and at times may do things contrary to
what I should, don't we all.

At the same time Tom is doing his answers while watching the presentations, I am
proofing the latest books, answering client calls, helping other DBAs and
traveling nearly 80%. Afraid it doesn't leave much time for generating pages of
proofs except when they are germane to my current tasks. I don't get paid a
salary, only hours worked as a 1099 consultant, each hour I spend answering
questions from non-clients, writing presentations for users groups, articles for
online and paper publications and trying to deal with online attacks takes away
from my family and job and incidentally, from time that could be spent testing and
learning against the Oracle database.

One thing to remember, the real questions (other than those from folks that
don't know where to look for answers, or who don't want to look for answers)
usually don't make it to the forums, to TARs to asktom until they have been
beaten to death by (or beat to death) guys like me attempting to solve the
issues often against a deadline.

Anyway, enough said.




Followup:

Mike,

While I got your ear.

Regarding your recent "IMHO" article. (oh cool, I see it is now "a warning" to
all that enter:
<quote src=burleson

[url]http://dba.ipbhost.com/index.php?showtopic=1473#entry5574[/url]

>
Good answer. Mike also has important warnings about the askom site:
</quote>
) watch out all, you might get hurt here now.  



In that "warning" you start with a statement about "credible"

<quote>
It's can be very difficult to get credible information from some of the
Oracle-owned forums such as  asktom.oracle.com.
</quote>

would you care to elaborate on that at all?  Seems the article has nothing to do
with credibility at all?



It does however seem to be a reprint of a bunch of email Don sent us here at
Oracle (pretty much point for point).  Now, I know that Don would never "bash"
anyone in public, rather -- he wants others to do it for him (picked that up
from the proof thread where he used an anonymous 3rd party DBA with 30 years of
experience) so I'm sure he is thankful you provided him the outlet to do that.



The whole "extra cost Oracle book" part in there was funny.  (what, you don't
write and expect to get compensated for it?  Yeah, I'll get right on that free
book deal...).  (Oh, but I did point the reader to a slew of free books, over
10,000 pages of free books, but ship them one?  I don't think so, no)


You see, this forum here is not moderated.  I do not remove posts that just
express opinions.  I do not review material prior to its release (I do not
review or even see all of the posts).  When asked in the past, I've actually
removed from view posts at Don's request.  

As for
<quote>
the asktom web site is written in HTML-DB and it would be a simple code change
to require positive identification of respondents
</quote>

you were only kidding that it would be a simple code change to require
positive identification of respondents right?  Come on, as a technologist
yourself you must know how hard that would be.  Even Don alludes to it:


[url]http://dba.ipbhost.com/index.php?showtopic=1396&st=30#entry5113[/url]

when one of your own users demonstrated how trivial it is to become anyone they
want on your forums.  If you know of a simple code change that could
"positively identify people", you would be rich many times over. (it was also very
curious to me that both you and Don posted many times after the alleged
"identity theft", many times... 8 times in fact).  So no, I didn't expect you to
constantly monitor the forum, I already knew you were doing so...


and thanks for not using IM speak in your followup, much appreciated.  (funny
how the people using IM speak do in fact appear to otherwise have a full command
of the english language, and the ones that do not have the command of the
english language don't seem to use it -- as they haven't learned the language
enough yet to figure out how.  Those (without the command of the language)
people are typically the ones most grateful that I make fun of IM speak since it
makes reading really really hard for them)

I know Don had the list ready for you to go based on his email from Feb this
year <quote>"There are people who feel that the tawdry c**p that you allow to be
published on "asktom" makes all of Oracle Corporation look bad (e.g. "I think
your vowel keys are stuck", etc., I have compiled an 8-page list)."</quote>
