Oracle Strange String Comparisons
When Oracle Pads the Numbers
Granted, this is more of a functionality issue, than performance, but it came-up during a performance analysis, and it surprised me in a big way. I bet a lot of DBAs are surprised by this. Just for fun, I took a survey of the dozen DBAs in my group. Not one DBA knew about this.
What is the Problem?
One would intuitively think that, when comparing strings, the strings are only "equal" if the two are absolutely identical. In Oracle, however, this is not true. What? It's true--the two strings do not need to be identical. This can lead to confusion and surprising results.
The key is what Oracle calls, Blank-Padded and Non-padded Comparison Semantics. Let's take a look at an actual example, and see how this plays out.
Getting Bind Info
I was assisting a developer in trying to improve the performance of a particular application query. It was critical that I know the exact parameters that were used. I grabbed the bind values using this script below, so I was sure I was using the correct parameters:
Select distinct a.inst_ID,
to_char(last_captured, 'mon-dd-hh24:mi') cap,
c.name||'/'||c.value_string bind_var
From GV$sqlarea a,
Dba_users b,
Gv$sql_bind_capture c
Where b.user_id=a.parsing_user_id
And b.username != 'SYS'
And c.address=a.address
And a.sql_id = [TBD]
Order by 1,2;
During my testing, I always got 0 rows--but that's not what the result was in the production; so it seemed as though my analysis was wrong.
Essence of the Problem
The Sql I was analyzing was pretty long, but I boiled the problem down to just one clause. In this particular clause, I always got zero rows--contradicting what the application actually did in production. Here's the gist of what I was doing:
Var b1 VARCHAR2(10)
Exec :b1 := 'DIV_YS';
Select count(*) from Detail_Table Where LEVEL_2_CODE = :b1;
COUNT(*)
--------
0
Change the Data Type
I'm not sure why I thought of this, but I changed my variable type to CHAR instead of VARCHAR2. I then repeated the query:
Var b1 CHAR(10)
Exec :b1 := 'DIV_YS';
Select count(*) from Detail_Table Where LEVEL_2_CODE = :b1;
COUNT(*)
--------
10
What! Why did the answer change? I discovered that in Oracle, my code yields a functionality difference, depending on how exactly I define my bind variable. When I switched to CHAR, Oracle automatically switched to blank-padded comparison semantics. With this method of comparison, many rows in the table matched.
Let's take a closer look at this.
Comparison Semantics
With blank-padded semantics, if the two values have different lengths, then Oracle first adds blanks to the end of the shorter one so their lengths are equal. Oracle then compares the values character by character up to the first character that differs. The value with the greater character in the first differing position is considered greater. If two values have no differing characters, then they are considered equal. This rule means that two values are equal if they differ only in the number of trailing blanks.
So, for a CHAR field, 'A' = 'A '
With blank-padded semantics, if the two values have different lengths, then Oracle first adds blanks to the end of the shorter one so their lengths are equal. Oracle then compares the values character by character up to the first character that differs.
This is documented in the official Sql reference guide: Oracle Database SQL Language Reference 11g Release 1 (11.1)
"Oracle uses blank-padded comparison semantics only when both values in the comparison are either expressions of data type CHAR, NCHAR, text literals, or values returned by the USER function."
This rule means that two values are equal if they differ only in the number of trailing blanks. That's exactly my situation in my example above.