Category Archives: DBMS

Test your SQL

SQL Fiddle is the place to test your queries and database design. You can simply share the links with other team members to let them know what you are thinking. It provides a number of database platforms like Oracle, MySQL, MS SQL Server, PostgreSQL etc.

Installing Postgres 9

I was trying to install Postgres 9 on my Ubuntu 11.04 machine, but the GUI installer simply hung. A little bit of googling provided a workaround.

Follow these steps:

1. sudo apt-get install python-software-properties
2. sudo add-apt-repository ppa:pitti/postgresql
3. sudo apt-get update
4. sudo apt-get install postgresql-9.0 libpq-dev

Source
http://www.dctrwatson.com/2010/09/installing-postgresql-9-0-on-ubuntu-10-04/

Isolation levels of a database

In my last post I talked about the ACID properties of a database. Out of these, Isolation is one of the most important (and a bit complex) properties, as it makes sure that no “dirty reads” occur. There can be various levels of database isolation.

1. Read Uncommitted: The lowest level of isolation. It does not implement any checks, so a transaction can read while another transaction is writing, and hence dirty reads can happen.

2. Read Committed: Reads only committed data, which handles the dirty read problem. But say the data for a row is read multiple times within the same transaction, and between the two reads the row gets updated and committed by some other transaction. The transaction has then read different data at different times for the same row (a non-repeatable read).

e.g. select name from employee where id=10;

-- do something which takes time

select name from employee where id=10;

commit;

The data can change between the two reads.

3. Repeatable Reads: This solves the problem mentioned above with Read Committed. This level does not allow any modification or deletion of the data which is being accessed by another transaction. So in the last example, if we read the same data multiple times, it is guaranteed to return the same info. Though we cannot update or delete the rows being accessed by a transaction, we can still add more rows. So if the query was

select name from employee;

instead of select name from employee where id=10;

we still could have a different data set each time (a phantom read).

4. Serializable: This is the highest level of restriction placed on transactions. It states that no update, deletion or addition can happen on a dataset/table which is being accessed by another transaction. This solves the phantom read problem mentioned in the previous example, but the solution comes at a high cost, as this restriction level slows down the rate at which transactions are executed.
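Most databases let you pick the isolation level per transaction. A minimal sketch, reusing the employee example from above (the exact statement varies a bit between Oracle, MySQL and Postgres):

-- ask for the strictest level before the reads
set transaction isolation level serializable;

select name from employee where id=10;
-- do something which takes time
select name from employee where id=10; -- guaranteed to see the same data set

commit;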

ACID test for your database

ACID is a set of properties that each database system needs to guarantee in order to make sure that all transactions will be processed reliably.

Atomicity: A transaction is always committed as a whole. That means if the transaction has 10 statements, either all 10 are executed and committed, or none are. A partial execution of a transaction will never occur.

Consistency: The database is always in a valid state. So if a transaction moves the database from state A to state B, both A and B are consistent states.

Isolation: All transactions execute in isolation and do not interfere with one another (if one transaction is updating data while another is reading it at the same time, the dirty read is handled).

Durability: Once a transaction is committed, the changes will not be reverted by a hardware or software failure.
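As a minimal sketch of atomicity in action, assuming a hypothetical accounts table (begin/commit syntax as in MySQL/Postgres): either both updates below take effect, or neither does.

-- transfer 100 between two hypothetical accounts, atomically
begin;
update accounts set balance = balance - 100 where id = 1;
update accounts set balance = balance + 100 where id = 2;
commit;     -- both updates become durable together

-- rollback; -- if issued before commit, neither update is applied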

Forgot MySQL password?

Forgot the root password for your MySQL database? Here are easy steps to recover it.

1. Restart the MySQL server without a password, in safe mode
/etc/init.d/mysql stop
mysqld_safe --skip-grant-tables &

2. Change root password 
mysql -u root
mysql> use mysql;
mysql> update user set password="mypassword" where User='root';
mysql> flush privileges;
mysql> quit

** Please note that the password is stored in hashed form, so use the PASSWORD('some_pass') function:
mysql> update user set password=PASSWORD('mypassword') where User='root';

3. Restart database
/etc/init.d/mysql stop
/etc/init.d/mysql start

References: http://www.cyberciti.biz/tips/recover-mysql-root-password.html

Taking MySQL Backups

If you need to take a backup of a MySQL database in SQL format, it is pretty simple (you will be prompted for the password):

mysqldump -u myDatabaseUser -p databaseName > backup.sql

You will get a backup.sql file which you can restore anywhere:

mysql -u myDatabaseUser -p -D databaseName < backup.sql

or simply execute the file from the mysql prompt:

mysql> source backup.sql

References: http://www.roseindia.net/mysql/mysql_backup_restore.shtml

For creating a user and granting privileges, see
http://www.debuntu.org/how-to-create-a-mysql-database-and-set-privileges-to-a-user
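As a quick sketch of what the linked steps boil down to (the user, host and database names here are hypothetical):

mysql> create user 'appuser'@'localhost' identified by 'some_pass';
mysql> grant all privileges on myDatabase.* to 'appuser'@'localhost';
mysql> flush privileges;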

How do database (Oracle) joins handle filter conditions

I was wondering how a database interprets join queries. When we create a join between 2 tables in Oracle, with some additional filter condition on one or both tables, will Oracle join the tables first and then filter, or will it apply the filter conditions first and then join?

For example, I have 2 simple queries.

Query 1:

select e.name, d.name from employee e, department d
where e.dept_id=d.id and e.salary>50000;

Query 2:

select e.name, d.name from
(select * from employee where salary>50000) e, department d
where e.dept_id=d.id;

Now which is a better option?

I posted my doubts on a couple of forums and did some googling, and this is what I found

“Both are same”

An easy way to check this out is to use the “explain plan for” clause, which tells you how Oracle actually interprets your query after optimization.

explain plan for
select e.name, d.name from employee e, department d
where e.dept_id=d.id and e.salary>50000;
select * from table(dbms_xplan.display());

The output was the same as for

explain plan for
select e.name, d.name from
(select * from employee where salary>50000) e, department d
where e.dept_id=d.id;

Abusing Foreign Keys

Recently I came across a database design which made me think about the extent to which one should use foreign keys. A colleague created a new table in an existing schema and used the maximum number of foreign key constraints possible, to make sure no junk data gets entered into this table. On one hand it is a good idea to use foreign key constraints to make sure your data is clean all the time; on the other hand it might introduce some unwanted conditions in your code.

Let me take an example. You have 2 tables, employee and salary. Employee has a primary key employee_id which you use as a foreign key in your salary table. Sounds smooth: if we have a salary entry, we must have the employee, and if the employee is moving out (record getting deleted), the salary entry gets deleted too. Let's say I have another table called print_history which stores who has printed salary slips for an employee. This table has 2 employee ids, one for the manager who printed the slip (say manager_id) and one for the employee whose slip was printed (say emp_id). Now we know that logically these 2 ids should reference the employee table, but is it necessary to add the constraint? It will work fine while inserting records, but say an employee record is getting deleted: should we enforce deletion of the print_history records? Should we stop the employee record from being deleted if there is a print_history entry? (I know soft deletes might help, but still, why add unnecessary constraints?) A sketch of the dilemma follows.
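To make the trade-off concrete, here is a minimal sketch of that print_history table (the schema is the hypothetical one from the example above; column types are illustrative):

create table print_history (
  id         int primary key,
  manager_id int,
  emp_id     int,
  printed_on date,
  -- every constraint below ties employee deletion to this table:
  constraint fk_ph_manager foreign key (manager_id) references employee (employee_id),
  constraint fk_ph_emp foreign key (emp_id) references employee (employee_id)
    on delete cascade -- or drop this clause, and deleting the employee will simply fail
);

With the cascade, deleting an employee silently wipes the print history; without it, the delete is blocked. Either way, the constraint couples employee deletion to a table that is really just a log.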

Deleting duplicate rows from a table

Recently I faced an issue of duplicate rows in a table of a production database. Normally a primary key is used to delete or insert data into a table, but due to bad database design duplicate rows somehow got added. Exact same data, every column was the same.

A little bit of googling solved the issue.

delete from temp_table where rowid not in (select min(rowid) from temp_table group by column1, column2);

Note that this might not work straight away for a table with huge data. A simple way around that is to create a temporary table, say temp_table2, and copy the duplicate rows from temp_table into it. Now clean up temp_table2 using the above delete command, delete all the duplicate rows from temp_table, and insert the remaining, cleaned-up rows from temp_table2 back into temp_table.
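As a rough sketch of those steps (assuming, as above, exact duplicate rows and a database that supports rowid, e.g. Oracle; table and column names are the placeholders from the example):

-- 1. copy only the duplicated rows into a scratch table
create table temp_table2 as
  select * from temp_table
  where (column1, column2) in
    (select column1, column2 from temp_table
     group by column1, column2 having count(*) > 1);

-- 2. keep a single copy of each row in the scratch table
delete from temp_table2
  where rowid not in (select min(rowid) from temp_table2
                      group by column1, column2);

-- 3. remove all copies of the duplicated rows from the original table
delete from temp_table
  where (column1, column2) in (select column1, column2 from temp_table2);

-- 4. put the single remaining copies back, then drop the scratch table
insert into temp_table select * from temp_table2;
drop table temp_table2;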