The definition of ORM
Let's have a look how Wikipedia defines ORM:
Object–relational mapping (ORM, O/RM, and O/R mapping tool) in computer science is a programming technique for converting data between a relational database and the heap of an object-oriented programming language.
That's actually it. But to better understand what ORM is, and especially what it isn't, let's find out what the abbrevation ORM exactly means:
- O, "Object": an object (instance of a class) in your programming language, which usually represents a row from a database
- R, "Relation": the relation defines what database row the objects (and what database columns that object properties) are related to, and what database columns that object properties
- M, "Mapping": the action of creating objects that have a relation to your database
An example of a very simple ORM
Let's assume we have a database with a users table.
The table has the following columns: id, first_name, last_name
Somewhere in your code you need to work with the user with ID 3. So we read out that user using SQL:
SELECT id, first_name, last_name FROM users WHERE id = 3
You can execute that query using your programming language's SQL extension. For example, in PHP you probably will
use PDO.
The database extension of your database doesn't know about any database objects. So you will receive a map object like this:
map{
["id"]=>
int(3)
["first_name"]=>
string(3) "Foo"
["last_name"]=>
string(3) "Bar"
}
Now you could work with that map object, pass it everywhere in your business logic where you need it.
But this will sooner or later become a problem because that map is not well defined.
In PHP map objects are simply arrays. So if you have a function that expects a user, you would need to define it like this:
<?php function greet(array $user) { ... }
The argument has the array type. In other programming languages this will probably some kind of map type. So the compiler makes sure you pass an array or a map. But it doesn't make sure that it's actually an user array/map that contains an ID, first name and last name.
So how to avoid that?
Right after reading out the object from the database we map it to an object:
<?php
class User {
private int $id;
private string $firstName;
private string $lastName;
public function getId(): int
{
return $this->id;
}
public function getFirstName(): string
{
return $this->firstName;
}
public function getLastName(): string
{
return $this->lastName;
}
public static function fromRaw(array $raw): self
{
$user = new self();
$user->id = $raw['id'];
$user->firstName = $raw['first_name'];
$user->lastName = $raw['last_name'];
return $user;
}
}
$raw = fetch('SELECT id, first_name, last_name FROM users WHERE id = 3');
$user = User::fromRaw($raw)
You can now rewrite your function like this:
<?php function greet(User $user) { ... }
This is much safer because everywhere you want work with a user, you pass a User object. The compiler will tell you if you pass something else by mistake.
You know what we just did here, right?
Right?
Correct!
We just implemented an ORM.
We build some mapping logic, that maps objects to the related database data.
Does our ORM support collections (multiple users)? No.
Does our ORM handle a relationships? No.
Is our ORM even capable of saving data? No.
Is there room for improvement? For sure.
Still, it's ORM.
For the sake of completeness, here is an example how to create a collection of users:
<?php
class UserCollection {
/**
* @var array<User>
*/
private array $users = [];
/**
* @return @array<User>
*/
public function getUsers(): array
{
return $this->id;
}
public static function fromRaw(array $raw): self
{
$collection = new UserCollection();
foreach ($raw as $userRaw) {
$collection->users[] = User::fromRaw($userRaw);
}
return $collection;
}
}
$raw = fetch('SELECT id, first_name, last_name FROM users WHERE id IN (3, 4)');
$users = UserCollection::fromRaw($raw);
Saving & Reading data
After you implemented an ORM you might want to normalize the way data can be retrieved and saved.
You probably don't want to repeat the SQL query to find a user by ID at different places in your business logic.
An obvious place to implement data retrieval is in the corresponding object. You probably have seen something similar to this in many ORM libraries. Like Ruby on Rails or Laravel's Eloquent.
<?php
class User extends ActiveRecord {
private int $id;
private string $firstName;
private string $lastName;
...
public static function findById(int $id): User
{
$raw = $this->fetch('SELECT id, first_name, last_name FROM users WHERE id = ?', $id);
return self::fromRaw($raw);
}
}
$user = User::findById($id)
This is convenient, because the same class is used for both data retrieval and relation representation.
Additionally the class could be extended by a save() method to insert or update database rows:
<?php
class User extends ActiveRecord {
private int $id;
private string $firstName;
private string $lastName;
...
public function save(): void
{
$this->execute('INSERT INTO users (id, first_name, last_name) VALUES (?, ?, ?) ON DUPLICATE UPDATE', $this->id, $this->firstName, $this->lastName);
}
}
$user = User::findById($id);
$user->firstName = 'Guy';
$user->save();
If the objects that represent your database's data are also able to read out and modify (insert, update, delete) your data, it's considered an Active Record.
This is a specific design pattern.
Yes, Active Records are an implementation of an ORM. But not every ORM is an Active Record.
I've created another blog post about the problems regarding Active Records:
Active Record Pattern.
To make it short: There's probably no issues with Active Records in smaller projects, but in mid size or enterprise projects you may don't allow all the business logic you pass
a User object to, to also read out more or modify data existing data.
Because of this violation of separation of concerns and the single responsibility principle Active Records are often disliked.
But this is no reason to dislike ORMs in general - because, again, ORms are not (necessarily) Active Records.
So, what's the alternative to implementing the saving and reading logic withing the same class?
Yep. To implement the saving and reading logic in some other class.
In many ORM libraries this class is called a repository.
<?php
class UserRepository
{
public function __construct(private readonly Database $database) {}
...
public function findUserById(int $id): User
{
$raw = $this->fetch('SELECT id, first_name, last_name FROM users WHERE id = ?', $id);
return User::fromRaw($raw);
}
public function findAll(): UserCollection
{
$raw = $this->fetch('SELECT id, first_name, last_name FROM users');
return UserCollection::fromRaw($raw);
}
}
$database = ...;
$repository = new UserRepository($database);
$user = $repository->findById(3);
$users = $repository->findAll();
Using this approach, all your business logic that retrieves a User object won't be able to use the same class or object to read or manipulate data.
Still, you have one single class that is responsible to read users from your database: UserRepository.
And, by the way, do you see how it's possible that user repository contains SQL code? This is totally legit.
Critics often claim that ORMs are slow, so instead they prefer to write their own, fast and custom SQL queries.
But: One does not preclude the other.
Sure, many ORM libraries ship with a bunch of query methods that are quite common - like findById, findAll etc. But usually
they don't force you to do so.
What about saving or deleting data?
Well, you could still use Repository class and implement a delete and save method in there.
But you may also want to keep reading data and manipulating data separated. In that case you could create an EntityManager
<?php
class UserEntityManager {
public function __construct(private readonly Database $database) {}
...
public function save(User $user): id
{
$this->execute('INSERT INTO users (id, first_name, last_name) VALUES (?, ?, ?) ON DUPLICATE UPDATE', $user->getId(), $this->getFirstName(), $this->getLastName());
}
$databae = ...;
$repository = new UserRepository($database);
$entityManager = new UserEntityManager($database);
$user = $repository->findById(3);
$user->firstName = 'Dude';
$entityManager->save($user);
While we are at it... you may want to have your User class to be DTO, a stupid data transfer object with no business logic in it at all.
In that case, you should move the mapping logic implemented in the fromRaw method into another class:
<?php
<?php
class User
{
private int $id;
private string $firstName;
private string $lastName;
public function __construct(
private int $id,
private string $firstName,
private string $lastName
) {}
public function getId(): int
{
return $this->id;
}
public function getFirstName(): string
{
return $this->firstName;
}
public function getLastName(): string
{
return $this->lastName;
}
}
class UserMapper
{
public function map(array $raw): User
{
$user = new User();
$user->id = $raw['id'];
$user->firstName = $raw['first_name'];
$user->lastName = $raw['last_name'];
return $user;
}
}
class UserRepository
{
public function __construct(
private readonly Database $database,
private readonly UserMapper $mapper
) {}
...
public function findUserById(int $id): User
{
$raw = $this->fetch('SELECT id, first_name, last_name FROM users WHERE id = ?', $id);
return $this->mapper->map($raw);
}
...
}
You then have to inject your UserMapper into your UserRepository, so the repository is able to not only retrieve the raw data from the database but also create instances of User objects.
You probably don't want to implement all this by yourself.
There are great libraries out there which do this for you. A good example is PHP's Doctrine or Java's Hibernate.
For Go, you could use GORM, but you might have to implement the repositories yourself.
Summary
I hope I could convince you that ORMs are not only a good idea, no, they are necessary if your software reaches a certain size.
You have to make sure that the data you work with are in a specific structure - therefore you have to use ORMs to some degree.
You may don't want to integrate a huge ORM library. You may start with something simple, probably you could write your own ORM. But keep in mind that often software doesn't stay small and simple - it grows. And it's not unlikely that it will grow to a size where it's beneficial to use a reliable, fully tested ORM instead of your own custom solution.
ORMs don't have to be slow. They don't have to be complex. They don't have to violate well-proven software paradigms.
ORMs are great.