Writing database portable PHP applications

Published on 12.12.2011, by Lubos Dzurik

Writing database cross-compatible SQL queries for portable web applications is difficult. In fact, each modern relational database system (RDBMS) introduces its own "specialties" and proprietary SQL syntax that aims to provide the best perfomance and functionalities for its own database engine.

Unfortunatelly this also means loss of the portability of the application, once one starts implementing such a particular syntax.

In this article I will point out basic differences for defining a simple SQL table named "im_inventory" - differences in definitions of indexes, primary keys, autoincremental columns and column quotation.

Table definition - mySQL

DROP TABLE IF EXISTS `im_inventory`;

CREATE TABLE IF NOT EXISTS `im_inventory` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `accessory` text,
  `fk_category_id` int(11) DEFAULT NULL,
  `date_enlist` int(11) DEFAULT 0,
  `price_enlist` float DEFAULT 0,
  `fk_currency_code` varchar(3) DEFAULT NULL,
  `created_date` int(11) DEFAULT NULL,
  `fk_created_user_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_category_id` (`fk_category_id`),
  KEY `fk_currency_code` (`fk_currency_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Notes on mySQL:
  • mySQL uses backtick [`] for quoting the column names
  • the column `name` must be quoted to avoid conflict with reserved keyword
  • keys (primary, unique, ..) are defined at the end of the table
  • character set and collation can be included individually for each table
  • mySQL supports SQL constructs LIMIT, OFFSET in SELECT query.

Table definition - Microsoft SQL Server (T-SQL)

if object_id('dbo.im_inventory') is not null
begin
  drop table [dbo].[im_inventory]
end

CREATE TABLE [im_inventory] (
  [id] int IDENTITY (1,1) NOT NULL,
  [name] nvarchar(255) NULL,
  [accessory] nvarchar(4000) NULL,
  [fk_category_id] int NULL,
  [date_enlist] int DEFAULT 0,
  [price_enlist] float DEFAULT 0,
  [fk_currency_code] varchar(3) NULL,
  [created_date] int NULL,
  [fk_created_user_id] int NULL,
  PRIMARY KEY ([id])
);

create index [fk_category_id] on [im_inventory] ([fk_category_id]);
create index [fk_currency_code] on [im_inventory] ([fk_currency_code]);

Notes on MSSQL (T-SQL):
  • MSSQL uses square brackets [column] or double quotes "column" for quoting the column names
  • MSSQL does not directly support SQL construct LIMIT, OFFSET in SELECT query. It must be indirectly emulated via negated subselection. Surprisingly, even such a complex queries are quite fast.
  • incremental sequences are introduced via object IDENTITY(start, step).
  • for storing multibyte characters one should use NVARCHAR instead of VARCHAR.
  • There is no NTEXT column type, only TEXT for single byte characters. This causes however different behaviours in particular multibyte queries.
  • Personally I consider MSSQL for second in row least flexible RDBMS.

Table definition - PostgreSQL


DROP TABLE IF EXISTS "im_inventory";

CREATE TABLE "im_inventory" (
  "id" serial,
  "name" varchar(255),
  "accessory" text,
  "fk_category_id" integer,
  "date_enlist" integer DEFAULT 0,
  "price_enlist" numeric DEFAULT 0,
  "fk_currency_code" varchar(3),
  "created_date" integer,
  "fk_created_user_id" integer,
  CONSTRAINT "pk_inventory_id" PRIMARY KEY ("id")
);

create index "idx_inventory_category_id" on "im_inventory"("fk_category_id");
create index "idx_inventory_currency_code" on "im_inventory"("fk_currency_code");

Notes on PostgreSQL:
  • PostgreSQL is nicely flexible from some point of view. Introducing a complex column type "serial" is a big plus.
  • Quotation character is double quotes ["].
  • Syntax familiar to mySQL and SQLite.
  • Uses "numeric" instead of "float" (mySQL, MSSQL).

Table definition - Oracle (PL/SQL)


drop table "im_inventory" cascade constraints
/
drop sequence seq_im_inventory_id
/

create table "im_inventory" (
  "id" 					number not null,
  "name" 				varchar2(255) not null,
  "accessory" 			varchar2(2000),
  "fk_category_id" 		number,
  "date_enlist" 		number default 0,
  "price_enlist" 		number default 0,
  "fk_currency_code" 	varchar2(3),
  "created_date" 		number,
  "fk_created_user_id" 	number
)
/

alter table "im_inventory"
add constraint pk_im_inventory_id primary key ("id")
/

create sequence seq_im_inventory_id
  increment by 1
  start with 1
  minvalue 1
  maxvalue 999999999999999999999999999
  nocycle
  order
  cache 20
/

create or replace trigger trg_b_i_im_inventory
 before
  insert
 on "im_inventory"
referencing new as new old as old
 for each row
begin
     if :new."id" is null then
        select seq_im_inventory_id.nextval
          into :new."id"
          from dual;
     end if;
end;
/

create index  idx_inventory_category_id on  "im_inventory" ("fk_category_id")
/
create index  idx_inventory_currency_code on  "im_inventory" ("fk_currency_code")
/

Notes on Oracle:
  • Oracle is definitely the most difficult RDBMS because it follows its own logic of implementation.
  • Quotation character is double quotes ["].
  • There are no autoincremental types - you must define sequence, then event trigger and within the trigger assign incremented value. This is definitely not intuitive for a no-oracle specialist:-(
  • Oracle similarly to MSSQL does not directly support construct OFFSET, LIMIT in SELECT clauses. It must be emulated via SQL subselection.
  • A lot of settings (NLS_* attributes) can be defined for each individual HTTP request (= 1 Oracle session). These may influence search results (e.g. ALTER SESSION SET NLS_SORT=BINARY_AI), set formatting properties (e.g. ALTER SESSION SET NLS_NUMERIC_CHARACTERS = '. '), collation properties etc.
  • The worst behaviour however is oracle's internal default conversion of unquoted column and table names into UPPERCASE characters. For case sensitive applications (like PHP) this means requirement to either define all table and column names in uppercased characters when using no quoted columns or obligatory quoting column names.
  • Oracle does not recognise type TEXT. VARCHAR(4000) is often (almost) sufficient replacement. Otherwise CLOB/NCLOB can be used with however different behaviour than VARCHAR/NVARCHAR, naturally. See Oracle data types.
  • Instead of integer and float/numeric it uses the "number" type.

Table definition - SQLite


DROP TABLE IF EXISTS "im_inventory";

CREATE TABLE "im_inventory" (
  "id" INTEGER PRIMARY KEY,
  "name" varchar(255) ,
  "accessory" text,
  "fk_category_id" integer,
  "date_enlist" integer DEFAULT 0,
  "price_enlist" numeric DEFAULT 0,
  "fk_currency_code" varchar(3),
  "created_date" integer,
  "fk_created_user_id" integer
);

create index "idx_inventory_category_id" on "im_inventory"("fk_category_id");
create index "idx_inventory_currency_code" on "im_inventory"("fk_currency_code");

Notes on SQLite:
  • SQLite has syntax familiar and pretty close to mySQL and PostgreSQL.
  • Quotation character is double quotes ["].
  • Uses numeric instead of float (mySQL, MSSQL).
  • SQLite has very limited security possibilities.
  • SQLite is not a real compatitor to other RDBMS regarding robustness, features and performance- it is here just fill compare syntax differences / similarities.

Conclusion - How to write database cross-compatible PHP applications

  • When trying to write portable web aplications, one should test it on the most difficult RDBMS - which is Oracle and MSSQL. Often, parts of the application needs to be rewritten when testing application on various databases. This significantly increases development costs. Also setting up environment for particular database in order to run unit tests is sometime pain (particularly Oracle).
  • Portable applications must unfortunatelly sacrifice benefits of a particular RDBMS's advanced functions. Never use view, triggers etc since this is almost never portable code.
  • Particularly, one should not use any DATE/TIME functions since these have almost zero portability - each RDBMS stores date in its own incompatible internal format. The only pure way is storing date as an unix integer timestamp in column of type integer (number). It enables fast sorting and PHP has reach set of date / time handling functions. For higher precision one may consider using float (numeric) column type.

    Note: As for PHP version bellow 5.1 on Windows - be careful with handling timestamps before 01.01.1970 due to known bug. Negative timestamps are supported on Windows only since PHP 5.1.

  • SQL queries should always be as simple as possible - ASAP:-) and stick with the most basic SQL-92 syntax. Script itself should do the rest of the job.

Leave your comment..
Email will be converted into something like [michael AT gmail DOT com]
Note: Offensive and unrelated comments will be deleted.
Please enter result from the picture above.