Category Archives: DBMS

Test your SQL

SQL Fiddle is the place to test your queries and database design. You can simply share the links with other team members to let them know what you are thinking. It provides a number of database platforms like Oracle, MySQL, MS SQL Server, PostgreSQL etc.

Installing Postgres 9

I was trying to install Postgres 9 on my Ubuntu 11.04 machine, but the GUI installer simply hung. A little bit of googling provided a workaround.

Follow these steps:

1. sudo apt-get install python-software-properties
2. sudo add-apt-repository ppa:pitti/postgresql
3. sudo apt-get update
4. sudo apt-get install postgresql-9.0 libpq-dev

Source
http://www.dctrwatson.com/2010/09/installing-postgresql-9-0-on-ubuntu-10-04/

Isolation levels of a database

In my last post I talked about the ACID properties of a database. Out of these, Isolation is one of the most important (and a bit complex) properties, as it makes sure that no “dirty reads” occur. There can be various levels of database isolation.

1. Read Uncommitted: The lowest level of isolation. It does not implement any checks, so a transaction can read while another transaction is writing, and hence dirty reads can happen.

2. Read Committed: Reads only committed data, which handles the dirty read problem. But say the data for a row is read multiple times within the same transaction, and between the two reads the row gets updated and committed by some other transaction. The transaction has then read different data at different times for the same row (a non-repeatable read).

e.g. select name from employee where id=10;

-- do something which takes time

select name from employee where id=10;

commit;

The data can change between the two reads.

3. Repeatable Reads: This solves the problem mentioned above with Read Committed. This level does not allow any modification or deletion of the data which is being accessed by another transaction. So in the last example, if we read the same data multiple times, it is guaranteed to return the same info. Though we cannot update or delete the rows being accessed by a transaction, we can still add more rows. So if the query was

select name from employee;

instead of select name from employee where id=10;

we still could have a different data set each time (a phantom read).

4. Serializable: This is the highest level of restriction placed on transactions. It states that no update, deletion or addition can happen on a dataset/table which is being accessed by another transaction. This solves the phantom read problem mentioned in the previous example, but the solution comes at a high cost, as this restriction level slows down the rate at which transactions are executed.
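Most databases let you pick the isolation level per transaction. A minimal sketch, reusing the employee example from above (the exact statement varies a bit between Oracle, MySQL and Postgres):

-- ask for the strictest level before the reads
set transaction isolation level serializable;

select name from employee where id=10;
-- do something which takes time
select name from employee where id=10; -- guaranteed to see the same data set

commit;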

ACID test for your database

ACID is a set of properties that each database system needs to guarantee in order to make sure that all transactions will be processed reliably.

Atomicity: A transaction is always committed as a whole. That means if the transaction has 10 statements, either all 10 are executed and committed, or none are. A partial execution of a transaction will never occur.

Consistency: The database is always in a valid state. So if a transaction moves the database from state A to state B, both A and B are consistent states.

Isolation: All transactions execute in isolation and do not interfere with one another (if one transaction is updating data while another is reading it at the same time, the dirty read is handled).

Durability: Once a transaction is committed, the changes will not be reverted by a hardware or software failure.
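As a minimal sketch of atomicity in action, assuming a hypothetical accounts table (begin/commit syntax as in MySQL/Postgres): either both updates below take effect, or neither does.

-- transfer 100 between two hypothetical accounts, atomically
begin;
update accounts set balance = balance - 100 where id = 1;
update accounts set balance = balance + 100 where id = 2;
commit;     -- both updates become durable together

-- rollback; -- if issued before commit, neither update is applied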

Forgot MySQL password?

Forgot the root password for your MySQL database? Here are easy steps to recover it.

1. Restart the MySQL server without a password, in safe mode
/etc/init.d/mysql stop
mysqld_safe --skip-grant-tables &

2. Change root password 
mysql -u root
mysql> use mysql;
mysql> update user set password="mypassword" where User='root';
mysql> flush privileges;
mysql> quit

** Please note that the password is stored in hashed form, so use the PASSWORD('some_pass') function:
mysql> update user set password=PASSWORD('mypassword') where User='root';

3. Restart database
/etc/init.d/mysql stop
/etc/init.d/mysql start

References: http://www.cyberciti.biz/tips/recover-mysql-root-password.html

Taking MySQL Backups

If you need to take a backup of a MySQL database in SQL format, it is pretty simple (you will be prompted for the password):

mysqldump -u myDatabaseUser -p databaseName > backup.sql

You will get a backup.sql file which you can restore anywhere:

mysql -u myDatabaseUser -p -D databaseName < backup.sql

or simply execute the file from the mysql prompt:

mysql> source backup.sql

References: http://www.roseindia.net/mysql/mysql_backup_restore.shtml

For creating a user and granting privileges, see
http://www.debuntu.org/how-to-create-a-mysql-database-and-set-privileges-to-a-user
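As a quick sketch of what the linked steps boil down to (the user, host and database names here are hypothetical):

mysql> create user 'appuser'@'localhost' identified by 'some_pass';
mysql> grant all privileges on myDatabase.* to 'appuser'@'localhost';
mysql> flush privileges;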

How do database (Oracle) joins handle filter conditions

I was wondering how a database interprets join queries. When we create a join between 2 tables in Oracle, with some additional filter condition on one or both tables, will Oracle join the tables first and then filter, or will it apply the filter conditions first and then join?

For example, I have 2 simple queries.

Query 1:

select e.name, d.name from employee e, department d
where e.dept_id=d.id and e.salary>50000;

Query 2:

select e.name, d.name from
(select * from employee where salary>50000) e, department d
where e.dept_id=d.id;

Now which is a better option?

I posted my doubts on a couple of forums and did some googling, and this is what I found

“Both are same”

An easy way to check this out is to use the “explain plan for” clause, which tells you how Oracle actually interprets your query after optimization.

explain plan for
select e.name, d.name from employee e, department d
where e.dept_id=d.id and e.salary>50000;
select * from table(dbms_xplan.display());

The output was the same as for

explain plan for
select e.name, d.name from
(select * from employee where salary>50000) e, department d
where e.dept_id=d.id;

Abusing Foreign Keys

Recently I came across a database design which made me think about the extent to which one should use foreign keys. A colleague created a new table in an existing schema and used the maximum number of foreign key constraints possible, to make sure no junk data gets entered into this table. On one hand it is a good idea to use foreign key constraints to make sure your data is clean all the time; on the other hand it might introduce some unwanted conditions in your code.

Let me take an example. You have 2 tables, employee and salary. Employee has a primary key employee_id which you use as a foreign key in your salary table. Sounds smooth: if we have a salary entry, we must have the employee, and if the employee is moving out (record getting deleted), the salary entry gets deleted too. Let's say I have another table called print_history which stores who has printed salary slips for an employee. This table has 2 employee ids, one for the manager who printed the slip (say manager_id) and one for the employee whose slip was printed (say emp_id). Now we know that logically these 2 ids should reference the employee table, but is it necessary to add the constraint? It will work fine while inserting records, but say an employee record is getting deleted: should we enforce deletion of the print_history records? Should we stop the employee record from being deleted if there is a print_history entry? (I know soft deletes might help, but still, why add unnecessary constraints?) A sketch of the dilemma follows.
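To make the trade-off concrete, here is a minimal sketch of that print_history table (the schema is the hypothetical one from the example above; column types are illustrative):

create table print_history (
  id         int primary key,
  manager_id int,
  emp_id     int,
  printed_on date,
  -- every constraint below ties employee deletion to this table:
  constraint fk_ph_manager foreign key (manager_id) references employee (employee_id),
  constraint fk_ph_emp foreign key (emp_id) references employee (employee_id)
    on delete cascade -- or drop this clause, and deleting the employee will simply fail
);

With the cascade, deleting an employee silently wipes the print history; without it, the delete is blocked. Either way, the constraint couples employee deletion to a table that is really just a log.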

Deleting duplicate rows from a table

Recently I faced an issue of duplicate rows in a table of a production database. Normally a primary key is used to delete or insert data into a table, but due to bad database design duplicate rows somehow got added. Exact same data, every column was the same.

A little bit of googling solved the issue.

delete from temp_table where rowid not in (select min(rowid) from temp_table group by column1, column2);

Note that this might not work straight away for a table with huge data. A simple way around that is to create a temporary table, say temp_table2, and copy the duplicate rows from temp_table into it. Now clean up temp_table2 using the above delete command, delete all the duplicate rows from temp_table, and insert the remaining, cleaned-up rows from temp_table2 back into temp_table.
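As a rough sketch of those steps (assuming, as above, exact duplicate rows and a database that supports rowid, e.g. Oracle; table and column names are the placeholders from the example):

-- 1. copy only the duplicated rows into a scratch table
create table temp_table2 as
  select * from temp_table
  where (column1, column2) in
    (select column1, column2 from temp_table
     group by column1, column2 having count(*) > 1);

-- 2. keep a single copy of each row in the scratch table
delete from temp_table2
  where rowid not in (select min(rowid) from temp_table2
                      group by column1, column2);

-- 3. remove all copies of the duplicated rows from the original table
delete from temp_table
  where (column1, column2) in (select column1, column2 from temp_table2);

-- 4. put the single remaining copies back, then drop the scratch table
insert into temp_table select * from temp_table2;
drop table temp_table2;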