February 08, 2011

C# Index Search

Csharp Index Search
C# index searching instead of SQL SELECT

Summery
Sometimes, i heard something about slowest searching on some databases.
For example last night, i heard about a famous company have an oracle database with one billion records in one table and their software have a HUGE LATENCY for a simple compare search (WHERE Cluse) and they wait more than 15 minutes for results.
Everybody's know that's ridiculous for a DBMS like Microsoft SQL or Oracle about 15 minutes on a simple query on tiny database and it's absolutely comes from bad configuration and they need to have a DBA (Database Administrator) instead of another DBMS or Software ...

Solution
Anyway, i developed a simple and simple indexing for demonstrate power of indexing in search processes. I was shocked after coming results

Test Machine:
CPU: Intel Core i5 750 @ 2.67GHz 2.73Ghz
RAM: 4.00GB (3GB useable)
OS: Windows 7 32-bit
Database: Microsoft SQL Server Compact 3.5
Application: developed from 11PM till 6AM just for fun :-)

Search Times:
on 100,000 records = 0.24s (SQL) vs <0.01s (Index)
on 200,000 records = 0.49s (SQL) vs <0.01s (Index)
on 500,000 records = 1.92s (SQL) vs <0.01s (Index)
on 1,000,000 records = 2.46s (SQL) vs 0.02s (Index)
on 1,500,000 records = 5.91s (SQL) vs 0.04s (Index)


Conclusion
From results was shown, search on index files, with the simplest algorithm, are more than 100 times faster than simple SQL QUERY, in normal situation, with any normalization and indexing, on Microsoft SQL CE database.

Appendix I: Screen Shots



Appendix II: Source Codes
You can see the process of make a hundered index file base on two charater of start 10 digits phone numbers. for example if the phone number is 3473380789 then it's store in array[34] and then store in index34.inx file as one of index files.
/// loop all data
ArrayList[] Array = new ArrayList[100];
SqlCeConnection cn = new SqlCeConnection(_ConnectionString);
if (cn.State == ConnectionState.Closed) cn.Open();
string query = "SELECT Phone FROM People";
SqlCeCommand command = new SqlCeCommand(query, cn);
command.CommandType = CommandType.Text;
SqlCeResultSet record = command.ExecuteResultSet(ResultSetOptions.Scrollable);
int RowCount = 0;
if (record.HasRows)
{
    record.ReadFirst();
    do{
        string Data = record.GetString(0);
        int Index = Convert.ToInt32(Data.Substring(0, 2));
        if (Array[Index] == null) Array[Index] = new ArrayList();
        Array[Index].Add(Data);
        RowCount++;
    } while (record.Read());
}

/// Create all index files from arrays
XmlSerializer Serializer = new XmlSerializer(typeof(ArrayList));
for (int i = 0; i <= 9; i++)
{
    for (int j = 0; j <= 9; j++)
    {
        int Index = i * 10 + j;
        Serializer.Serialize(new StreamWriter(_IndexFileName + Index + ".inx"), Array[Index]);
    }
}
and the find result is so simple than making index files
/// Read all index files
XmlSerializer Serializer = new XmlSerializer(typeof(ArrayList));
int Index = Convert.ToInt32(SearchItem.Substring(0, 2));
ArrayList array = Serializer.Deserialize(new StreamReader(_IndexFileName + Index + ".inx")) as ArrayList;
String Message = "NOT FOUND !!!";
if (array.BinarySearch(SearchItem) == 0) Message = "*** FOUND ON INDEX ***";
I think that it's clear about SQL WHERE CLUSE search but i put coeds here for compare
/// Read all index files
SqlCeConnection cn = new SqlCeConnection(_ConnectionString);
if (cn.State == ConnectionState.Closed) cn.Open();
string query = String.Format("SELECT * FROM People WHERE Phone = '{0}'", editPhone.Text);
SqlCeCommand command = new SqlCeCommand(query, cn);
command.CommandType = CommandType.Text;
SqlCeResultSet record = command.ExecuteResultSet(ResultSetOptions.Scrollable);
if (record.HasRows) Message = "*** FIND ***";
else Message = "No record found!";


Appendix III: Downloads
SourceCode: Here
Binary: Here

No comments: