Cogito, ergo sum

Here's hoping my musings can help you out!

Archive for September, 2005

Saturday Reading-Digital Fortress

Posted by mnshankar on September 18, 2005

Have a (really bad) sore throat – lots of fleeting thoughts as I lie on my couch. Man, if a sore throat can cause so much weakness and fatigue, I dread to think what a serious ailment would be like. There is profound meaning to the statement “Health is wealth”. There are so many things that one takes for granted. I guess you just have to thank your stars for your good fortune.

Anyways… I stayed at home all day long. Picked up a copy of ‘Digital Fortress’ by Dan Brown that I have wanted to read for a long time.
It was quick, light reading. Took me about 6 hrs maybe (cover to cover).
Is it just me or is there an uncanny resemblance between ‘Digital Fortress’ and ‘Angels and Demons’ in the central plot? In both stories, Dan builds up the suspense about who the actual ‘bad guy’ might be. I could easily surmise who it was in this case.

The whole concept of a ‘monster’ computer scanning every packet on the Internet did not quite make sense.

  • The search for the ring took up about 2/3 of the book and it turned out to be unnecessary!
  • The concept of all data in the universe being ‘on wire’ all the time so that TRANSLTR could snoop is absurd
  • Dan goes into elaborate detail to build up his case that the brightest and best minds work for the NSA. But in the end when it really mattered, a freaking linguist figured the whole thing out! They must have easy jobs in the NSA!

Nevertheless, it was what it was: a signature thriller from Dan Brown.

Later.

Posted in Personal

Codd’s Twelve Rules

Posted by mnshankar on September 16, 2005

Codd’s twelve rules for a system to be classified as a (fully relational) DBMS

  1. Information Rule – All information is represented as data values stored in the cells of tables
  2. Guaranteed Access Rule – Every data value must be accessible using a combination of table name + primary key + column name
  3. Systematic treatment of nulls – Nulls must be handled in a consistent manner, irrespective of the datatype of the column
  4. Metadata (the catalog) should be stored as relational tables and must be accessible using the same SQL constructs as ordinary data
  5. The data access language must provide all means of access and must be the only means of access
  6. All views that are theoretically updateable must be updateable by the system
  7. There must be set level inserts, updates and deletes – Here is an example of a set level delete:
    delete from payroll_transaction
    where transaction_id in
        (select transaction_id from #t_EmpPayrollTrans)

  8. Physical data independence
  9. Logical data independence
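  10. Integrity independence – Integrity rules should be stored in the data dictionary, not in application programs
  11. Distribution independence
  12. Nonsubversion rule – A low-level interface must not be able to bypass the integrity rules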
  13. (or Rule 0) The system itself must use its relational facilities exclusively to manage the database
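
To make the set level delete from rule 7 concrete, here is a minimal sketch that runs the same statement against an in-memory SQLite database through Python's standard `sqlite3` module. SQLite is only a stand-in for the original engine, the `amount` column and the sample rows are made up for illustration, and the `#t_EmpPayrollTrans` temp table is approximated with `CREATE TEMP TABLE` (SQLite has no `#` prefix).

```python
import sqlite3


def set_level_delete():
    """Delete several rows with ONE statement instead of a row-by-row loop."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical schema: only transaction_id comes from the post.
    cur.execute(
        "CREATE TABLE payroll_transaction ("
        "transaction_id INTEGER PRIMARY KEY, amount REAL)"
    )
    cur.executemany(
        "INSERT INTO payroll_transaction VALUES (?, ?)",
        [(i, 100.0 * i) for i in range(1, 6)],
    )

    # Stand-in for the #t_EmpPayrollTrans temp table.
    cur.execute("CREATE TEMP TABLE t_EmpPayrollTrans (transaction_id INTEGER)")
    cur.executemany("INSERT INTO t_EmpPayrollTrans VALUES (?)", [(2,), (4,)])

    # The set level delete: the whole set of matching rows goes in one statement.
    cur.execute(
        "DELETE FROM payroll_transaction WHERE transaction_id IN "
        "(SELECT transaction_id FROM t_EmpPayrollTrans)"
    )
    deleted = cur.rowcount

    remaining = [
        row[0]
        for row in cur.execute(
            "SELECT transaction_id FROM payroll_transaction ORDER BY transaction_id"
        )
    ]
    conn.close()
    return deleted, remaining


if __name__ == "__main__":
    print(set_level_delete())
```

The point is that the application never iterates over the rows itself; it describes the set and lets the engine do the work.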

Posted in Database

RDBMS Joins

Posted by mnshankar on September 16, 2005

The capability to link or join tables and generate one resultset from multiple tables is one of the most important characteristics of relational databases. A summary of join types is presented:

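Inner join: Returns only the rows that match the join condition. In the simplest (nested loop) strategy, for every row in the first table, SQL Server goes through the second table trying to find a match based on the join column. The complexity of this algorithm is O(mn) (m = number of rows in the first table, n = number of rows in the second table); with a usable index, or a hash or merge join, the engine can do better.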
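
The O(mn) behaviour is easy to see in a toy version of the nested loop strategy. This is only a sketch in plain Python, not how any real engine is implemented; the `emp`, `dept_id` and `dept` names are hypothetical sample data.

```python
def nested_loop_join(left, right, key):
    """Naive inner join: compare every left row with every right row."""
    matches = []
    comparisons = 0
    for left_row in left:          # m iterations
        for right_row in right:    # n iterations for each left row
            comparisons += 1
            if left_row[key] == right_row[key]:
                # Merge the two matching rows into one result row.
                matches.append({**left_row, **right_row})
    return matches, comparisons


if __name__ == "__main__":
    employees = [
        {"emp": "Ann", "dept_id": 1},
        {"emp": "Bob", "dept_id": 2},
        {"emp": "Cy", "dept_id": 3},
    ]
    departments = [
        {"dept_id": 1, "dept": "HR"},
        {"dept_id": 2, "dept": "IT"},
    ]
    rows, comparisons = nested_loop_join(employees, departments, "dept_id")
    print(rows, comparisons)
```

With m = 3 and n = 2 the loop body runs 3 × 2 times regardless of how many rows actually match, which is exactly the m × n cost.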

Outer Join: An OUTER JOIN operation returns all rows that match the JOIN condition, and it may also return some of the rows that don’t match, depending on the type of OUTER JOIN used. There are three types of OUTER JOIN: RIGHT OUTER JOIN, LEFT OUTER JOIN, and FULL OUTER JOIN.
A RIGHT OUTER JOIN operation returns all matching rows in both tables, and also rows in the right table that don’t have a corresponding row in the left table.
A LEFT OUTER JOIN returns all matching rows in both tables, and also rows in the left table that don’t have a corresponding row in the right table.
A FULL OUTER JOIN returns the union of a right outer and a left outer join. A FULL OUTER JOIN operation returns:

  • All rows that match the JOIN condition.
  • Rows from the left table that don’t have a corresponding row in the right table. These rows have NULL values in the columns of the right table.
  • Rows from the right table that don’t have a corresponding row in the left table. These rows have NULL values in the columns of the left table.
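
Here is a small sketch of the three outer joins using Python's `sqlite3` module. The `emp` and `dept` tables and their rows are made up for illustration. Because older SQLite versions lack RIGHT and FULL OUTER JOIN, the right join is written as a left join with the tables swapped, and the full join as a UNION of the two; that emulation is my assumption for portability, not something the engines discussed here require (note that UNION also collapses exact duplicate rows).

```python
import sqlite3


def outer_join_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE emp (name TEXT, dept_id INTEGER);
        CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
        INSERT INTO emp VALUES ('Ann', 1), ('Bob', 3);   -- Bob's dept is missing
        INSERT INTO dept VALUES (1, 'HR'), (2, 'IT');    -- IT has no employees
        """
    )

    # emp is the "left" table, dept is the "right" table.
    left_sql = (
        "SELECT e.name, d.dept_name FROM emp e "
        "LEFT OUTER JOIN dept d ON e.dept_id = d.dept_id"
    )
    # Emulated RIGHT OUTER JOIN: same join with the table order swapped.
    right_sql = (
        "SELECT e.name, d.dept_name FROM dept d "
        "LEFT OUTER JOIN emp e ON e.dept_id = d.dept_id"
    )
    # Emulated FULL OUTER JOIN: union of the left and right results.
    full_sql = left_sql + " UNION " + right_sql

    results = {
        "left": set(conn.execute(left_sql).fetchall()),
        "right": set(conn.execute(right_sql).fetchall()),
        "full": set(conn.execute(full_sql).fetchall()),
    }
    conn.close()
    return results


if __name__ == "__main__":
    for kind, rows in outer_join_demo().items():
        print(kind, rows)
```

Unmatched rows come back with NULL (Python `None`) in the columns of the other table, as described in the bullets above.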


Cross join – Cartesian product of two tables. A cross join with a condition (in the WHERE clause) is equivalent to an inner join.
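
A quick way to convince yourself of this is to run both forms and compare the results. The sketch below uses Python's `sqlite3` module with made-up `emp` and `dept` tables.

```python
import sqlite3


def cross_join_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE emp (name TEXT, dept_id INTEGER);
        CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
        INSERT INTO emp VALUES ('Ann', 1), ('Bob', 2);
        INSERT INTO dept VALUES (1, 'HR'), (2, 'IT');
        """
    )
    # Cartesian product: every emp row paired with every dept row.
    cross = conn.execute(
        "SELECT e.name, d.dept_name FROM emp e CROSS JOIN dept d"
    ).fetchall()
    # Cross join narrowed by a WHERE condition.
    filtered = conn.execute(
        "SELECT e.name, d.dept_name FROM emp e CROSS JOIN dept d "
        "WHERE e.dept_id = d.dept_id"
    ).fetchall()
    # The equivalent inner join.
    inner = conn.execute(
        "SELECT e.name, d.dept_name FROM emp e "
        "INNER JOIN dept d ON e.dept_id = d.dept_id"
    ).fetchall()
    conn.close()
    return cross, set(filtered), set(inner)


if __name__ == "__main__":
    print(cross_join_demo())
```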

Self join – A table joined with itself. The same table is referenced twice in the query under two different aliases, as if there were two copies of it; no physical copy is made.
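
The classic self join is an employee table that points at itself through a manager column. A minimal sketch with Python's `sqlite3` module follows; the `employee` table, its columns and its rows are hypothetical.

```python
import sqlite3


def self_join_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
        INSERT INTO employee VALUES (1, 'Ann', NULL), (2, 'Bob', 1), (3, 'Cy', 1);
        """
    )
    # The same table appears twice: alias e is the employee, alias m the manager.
    rows = conn.execute(
        "SELECT e.name, m.name FROM employee e "
        "JOIN employee m ON e.manager_id = m.id "
        "ORDER BY e.name"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(self_join_demo())
```

Ann has no manager, so an inner self join leaves her out of the result; a left outer self join would keep her with a NULL manager.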

Posted in Database

RDBMS Index Facts

Posted by mnshankar on September 16, 2005

Index pages make it possible to directly access any row in a table. Remember that indexes are not required: you can query and manipulate data without an index. However, data access is considerably faster when appropriate indexes are available. (A good database engineer takes pride in efficient access structures.)
Every column in a table can be indexed – but a decision has to be made keeping in view the additional overhead that index maintenance can present.

As a rule, I create indexes on the following:

  1. Primary keys
  2. Foreign keys /keys used in joins
  3. Columns that occur frequently in SQL ‘Where’ clauses
  4. Columns in SQL ‘order by’ or ‘group by’ clauses

Types of Indexes – Clustered and Non clustered

A clustered index physically orders rows in a table. So, you can only have one clustered index on a table.
Non clustered indexes store location information in their index pages to navigate to the data pages. If a clustered index exists, that location information is the clustered index key. If a non clustered index is built on top of a ‘heap’ (a table without a clustered index), internal row IDs are used to point to the data on disk.

A clustered index on a value is worthwhile only if you frequently process ranges of values (for example: WHERE key BETWEEN 10 AND 10000) and/or sort on the key value.
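
Why does physical ordering help range queries? A rough sketch in plain Python: a sorted list stands in for rows kept in clustered key order, and an unsorted list stands in for a heap. This is only an analogy (real engines work on pages and B-trees), and the function names and sample keys are made up.

```python
import bisect


def range_scan_ordered(sorted_keys, low, high):
    """Rows are stored in key order: find the start, then read neighbours."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    hits = sorted_keys[start:end]
    rows_read = end - start  # only the qualifying rows are read
    return hits, rows_read


def range_scan_heap(keys, low, high):
    """Rows are in no particular order: every row has to be examined."""
    hits = [k for k in keys if low <= k <= high]
    rows_read = len(keys)
    return hits, rows_read


if __name__ == "__main__":
    ordered = [5, 10, 20, 500, 10000, 20000]
    heap = [20000, 5, 500, 10, 10000, 20]
    print(range_scan_ordered(ordered, 10, 10000))
    print(range_scan_heap(heap, 10, 10000))
```

The ordered version also hands back the rows already sorted, which is why a clustered index helps ORDER BY on the same key.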

Note that in MySQL (with the InnoDB storage engine), the column(s) that you define as the primary key automatically become the clustered index. So, if you have a surrogate key (based on an autonumber) on a table, the DB engine automatically creates a clustered index based on that column. Since you can only have one clustered index on a table, this kinda sucks!

In MS SQL Server, on the other hand, you can have a table that has a non-clustered index on a primary key and a clustered index on a field that is involved in range queries/sorts (rock on!)

Key values in a clustered index MUST BE UNIQUE. If the UNIQUE keyword is not specified explicitly, uniqueness is enforced by an internal identifier added to duplicate keys (which is inaccessible to the user).

Do not use clustered indexes if you don’t have to. When adding a new row to a full ‘data’ page, the RDBMS does a ‘page split’ by moving approximately half the rows to a new page to make room. Page links in the index and data pages then need to be updated to maintain the physical sequence of records.
Page splits never need to occur in heaps, as pages are not linked (by definition, the pages are not in any order).
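
The page split described above can be sketched with a toy model in plain Python: each page is a small sorted list, and the pages themselves are kept in key order. `PAGE_CAPACITY` and both function names are hypothetical, and a real engine tracks far more (page links, free space, index entries).

```python
import bisect

PAGE_CAPACITY = 4  # hypothetical: rows that fit on one page


def clustered_insert(pages, key):
    """Insert into ordered pages; return True if a page split was needed."""
    if not pages:
        pages.append([key])
        return False

    # Find the page the key belongs on: the first page whose largest key
    # is >= key, otherwise the last page.
    idx = len(pages) - 1
    for i, page in enumerate(pages):
        if page and key <= page[-1]:
            idx = i
            break

    page = pages[idx]
    if len(page) < PAGE_CAPACITY:
        bisect.insort(page, key)
        return False

    # Page is full: move roughly half the rows to a new page placed right
    # after it, then insert the key into whichever half it belongs to.
    mid = len(page) // 2
    new_page = page[mid:]
    del page[mid:]
    pages.insert(idx + 1, new_page)
    target = page if key <= page[-1] else new_page
    bisect.insort(target, key)
    return True


def heap_insert(pages, key):
    """Heap: no ordering to maintain, so just append where there is room."""
    if not pages or len(pages[-1]) >= PAGE_CAPACITY:
        pages.append([])
    pages[-1].append(key)


if __name__ == "__main__":
    pages = [[10, 20, 30, 40], [50, 60]]
    print(clustered_insert(pages, 25), pages)
```

The heap insert never moves existing rows, which is the reason page splits are not needed there.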

Composite Indexes
An index that is created on more than one column on a table is called a composite index. An index on (column1, column2) is NOT the same as an index on (column2, column1). Define the most unique/most selective column first.
Composite indexes are useful for tables with multiple column keys. For example, in a telephone directory, a composite index on (last name, first name) can speed up searches.

Note that if the WHERE clause of a query references a lower order column of the composite index, it must also reference ALL higher order columns defined in the index for the query optimizer to use the composite index.
For example, if I have a composite index on (last name, first name) on Employee_Table,

  • Select Phone from Employee_Table where Last_Name='blah' (Composite index is used)
  • Select Phone from Employee_Table where First_Name='blah' (Composite index is NOT used)
  • Select Phone from Employee_Table where Last_Name='blah1' AND First_Name='blah2' (Composite index is used)
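
You can check this behaviour by asking the engine for its query plan. The sketch below uses SQLite (through Python's `sqlite3` module) as a stand-in, so the plan text is SQLite's and other engines will word theirs differently; the index name `idx_last_first` is made up. It creates the composite index and reports, for each of the three queries, whether the plan mentions it.

```python
import sqlite3


def plan_for(conn, sql):
    """Return SQLite's query plan for sql as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


def composite_index_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employee_Table (Last_Name TEXT, First_Name TEXT, Phone TEXT)"
    )
    # Hypothetical index name; the column order matters.
    conn.execute(
        "CREATE INDEX idx_last_first ON Employee_Table (Last_Name, First_Name)"
    )
    queries = {
        "last_only": "SELECT Phone FROM Employee_Table WHERE Last_Name = 'blah'",
        "first_only": "SELECT Phone FROM Employee_Table WHERE First_Name = 'blah'",
        "both": (
            "SELECT Phone FROM Employee_Table "
            "WHERE Last_Name = 'blah1' AND First_Name = 'blah2'"
        ),
    }
    plans = {name: plan_for(conn, sql) for name, sql in queries.items()}
    conn.close()
    return plans


if __name__ == "__main__":
    for name, plan in composite_index_demo().items():
        used = "idx_last_first" in plan
        print(f"{name}: index used = {used} ({plan})")
```

Some optimizers can still use the index for the First_Name-only query in special cases (for example when statistics suggest a skip scan pays off), so treat the rule as a strong default rather than an absolute.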

    

Posted in Database

Friday Night Movie

Posted by mnshankar on September 10, 2005

I watched ‘The Constant Gardener’ last night.

Basic plot:

Drug companies in collusion with the British government use people dying of AIDS in Africa to test a new TB drug. The formula has not been perfected and it would be a huge loss to go back to the drawing board – as is required by normal human testing protocols.

The director includes a lot of ‘personal connection’ between the protagonists in the flashback scenes. Intrigue, suspicion, love are all well packaged.

My Impression:

It was a good movie.. but ‘Extreme Measures’, a Hugh Grant movie with a very similar plot, was more chilling.

The Title:

I thought about why this movie was named ‘The Constant Gardener’. My best guess is that Fiennes was shown for the first half of the film as a soft guy who lives by the rules and spends most of his spare time gardening.
After his wife’s death, he was determined to finish the work she started and emerged the hero.

Have a great weekend folks!
Later…

Posted in Personal

Politics, Death and Mayhem

Posted by mnshankar on September 9, 2005

Katrina
Our dear president has sold to the gullible public the notion that we should give him carte blanche authority to spend countless millions of dollars and put hundreds of thousands of troops on foreign soil to keep the US safe in the euphemistic ‘post 9/11 world’.

Whether a crisis is unleashed by man or by nature, it is the responsibility of the government to be there. The government failed us on 9/11. It failed us again when Katrina struck.
So, while President Bush was enjoying his month long vacation, people along the Gulf Coast were sinking. FEMA failed to act in a timely manner and that is the undeniable truth.
“It’s as if the entire Gulf Coast were obliterated by the worst kind of weapon you can imagine,” Bush said.
I guess the idea of making that statement is that everything has to go with his new mantra of ‘war’ and ‘WMD’.. Good Lord!

Anyone who watches the Discovery Channel or CNN knows that New Orleans needed massive surgery to prevent internal flooding in case of a catastrophe like this. The leaders did not think it necessary. For the president to say that he did not anticipate this calamity is disgraceful to say the least.
Of course it was necessary (and urgent, mind you) to arrest the proliferation of ‘weapons of mass destruction’ and ‘liberate’ people in other countries. Seriously .. WHAT THE F*** IS GOING ON?
Just curious.. can a president be impeached for being stupid?

John Roberts

Now that Mr. Rehnquist is dead, Bush has recommended that John Roberts, originally nominated to replace retiring justice Sandra Day O’Connor, be the new Chief Justice of the Supreme Court. Don’t get me wrong.. I don’t doubt the intellectual prowess of John Roberts – the man may be a genius.. extremely well educated, politically savvy etc… It just seems that in a ‘sue-happy’ country like the US, someone with more than 2 years’ experience on the bench would be a better choice to head the Supreme Court!
Can a third grader be the president of the United States? (some may answer yes to this question given the quality of our current leadership.. but that is beside the point :-)

Every day, the US seems to be getting deeper into shit.

If I as a foreigner feel so bad for the US, how must citizens of this land feel?
If this is not incompetence, what is it?

Later…

Posted in News and politics

RDBMS Internals

Posted by mnshankar on September 7, 2005


A modern RDBMS is quite a complex beast. Here is a ‘simplified’ block diagram of the components involved. It is pretty darn good at abstracting the details from the end user. You don’t really have to know how each ‘module’ fits into the entire scheme of things.. but it definitely helps.
For instance, you get to know that when you ‘edit’ a single ‘field’ in a ‘table’ in a ‘database’, the ‘entire page’ on which the data exists is written to disk – because a memory ‘page’ is the smallest unit of IO for a database engine.
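
The ‘whole page gets written’ point can be sketched in a few lines of Python, with an in-memory byte buffer standing in for the data file. The 8192-byte `PAGE_SIZE` and the `update_field` helper are assumptions for illustration (page sizes differ between engines), and the sketch assumes the field does not straddle a page boundary.

```python
import io

PAGE_SIZE = 8192  # assumed page size in bytes; real engines vary


def update_field(storage, offset, new_bytes):
    """Change a few bytes, but read and write the WHOLE page that holds them."""
    page_no = offset // PAGE_SIZE
    page_start = page_no * PAGE_SIZE

    # Read the full page into memory.
    storage.seek(page_start)
    page = bytearray(storage.read(PAGE_SIZE))

    # Modify only the field inside the in-memory page
    # (assumes the field fits entirely within this page).
    rel = offset - page_start
    page[rel:rel + len(new_bytes)] = new_bytes

    # Write the full page back: the page is the unit of IO.
    storage.seek(page_start)
    bytes_written = storage.write(page)
    return page_no, bytes_written


if __name__ == "__main__":
    data_file = io.BytesIO(bytes(PAGE_SIZE * 3))  # three empty pages
    print(update_field(data_file, PAGE_SIZE + 100, b"abcd"))
```

Four bytes change, yet a full page's worth of bytes goes back to storage.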

In my professional career, I have studied a lot about databases and used just about every product out there – dBase, FoxBASE, FoxPro, Access, FileMaker, Approach, MySQL, SQL Server, Oracle, Sybase, DB2…..

But I must admit nothing blew my mind like Microsoft’s SQL Server. It is a GIANT product with extremely intuitive and functional client tools. MS has always been about great user experience.. be it Windows or Office or Visual Studio or (SQL Server) Enterprise Manager. The ability to ‘trace and debug’ stored procedures.. priceless! They really have thought of EVERYTHING.

Later…

Posted in Database