1 Introduction
In Haskell, we have many database libraries. Persistent is a layer atop, using some of them as its backends, that imposes type safety and seamless integration with Haskell code. With persistent-odbc
many more DBMSs can be attached. Own drivers can be written. Whenever Persistent turns out to be insufficient, we can fall back to the raw driver or its Haskell wrapper, and write a Persistent ORM ourselves. The framework is wonderful, but as everything has its caveats. We will discuss some them in the course of this series of posts here.
In other words, Persistent is a seamless marshalling or serialization library that removes most boilerplate code1.
This marshalling or object-relational mapping (ORM) imposes type-safety onto the conventional untyped database drivers, which may be written in an arcane dialect of C (macros!), such as the pgsql
driver, notorious for its undocumented features. We thereby can employ Haskell’s type system and GHC’s facilities to ensure that we don’t make mistakes in our queries and that the data remains consistent throughout our data transformations. The key point here is that any errors that might occur are pulled back from runtime to compile-time, where we can fix it proactively. In particular, type-safety prevents SQL injection attacks.
There are basically four different approaches to dealing with databases in Haskell:
use the existing drivers provided as backends and create a layer atop that imposes Haskell types on the level of Haskell code — this is the approach of serialization, marshalling, or ORM taken by Persistent and Esqueleto (more later), and many others listed below;
use the existing drivers via foreign-function interfaces (FFIs) by introducing a thin wrapper around them, and run direct textual queries, hoping for consistency — this is what
sqlite-simple
andpostgresql-simple
among others do (“mid-level client libraries”);run third-party database drivers as subprocesses — this is very crude but may turn out to be necessary in some situations;
rewrite the existing driver directly in Haskell, which enables us to introduce category-theoretic abstractions that can simplify code and improve performance — this is what Nikita Volkov’s
hasql
library for Postgres does.
Some other persistence and ORM-like libraries:
- Opaleye
- Beam
- Groundhog
- Selda
- Squeal
- Triggers for SQL
- Persistent-ODBC
- Persistent-Zookeeper
vault
(a persistent storage of values of arbitrary types)TCache
2 Limitations in Persistent that can be overcome with Esqueleto
So when do we use which of these two libraries?
- Persistent provides type-safe serialization of data. It allows for type-safe filtered storage and retrieval of data from a database (remote-resource, local resource, file, or in-memory databases). It does not support type-safe
JOIN
s, but we can fall back to explicit-string queries. - Esqueleto: provides an embedded domain-specific language (EDSL) for SQL and superposes type-safe
JOIN
s atop of Persistent SQL backends. As of writing this, Esqueleto supports only type-safeSELECT
,UPDATE
,INSERT
, andDELETE
queries. It does not provide for all SQL features. So we can either implement them by hand whenever necessary, which is relatively easy, or switch to the fall-back mode of explicit-string queries.
We can use both libraries in the same module, in which case we would have to import at least one of them qualified, so as to avoid identifier conflicts.2
-- For a module that mostly uses esqueleto.
import Database.Esqueleto
import qualified Database.Persistent as P
or import esqueleto
itself qualified:
-- For a module that uses esqueleto just on some queries.
import Database.Persistent
import qualified Database.Esqueleto as E
Our plan is to first learn to use Persistent, and then extend it with Esqueleto.
3 Fundamental Definitions
Here is the fundamental correspondence3:
SQL DBMS | Persistent |
---|---|
Data types | PersistValue |
Column | PersistField |
Table | PersistEntity |
Also consider this migrations (conversions) table for marshalling4:
Haskell | PostgreSQL | MySQL | MongoDB | SQLite |
---|---|---|---|---|
Text | |
VARCHAR |
TEXT |
String |
VARCHAR |
ByteString |
BYTEA |
BLOB |
BinData |
BLOB |
Int |
INT8 |
BIGINT(20) |
NumberLong |
INTEGER |
Double |
DOUBLE PRECISION |
DOUBLE |
Double |
REAL |
Rational |
NUMERIC(22,12) |
DECIMAL(32,20) |
unsupported | NUMERIC(32,20) |
Bool |
BOOLEAN |
TINYINT(1) |
Boolean |
BOOLEAN |
Day |
DATE |
DATE |
NumberLong |
DATE |
TimeOfDay |
TIME |
TIME 5 |
unsupported | TIME |
UTCTime 6 |
TIMESTAMP |
DATETIME 7 |
Date |
TIMESTAMP |
And, in particular, for MySQL and MariaDB:
Haskell type | Compatible MySQL types |
---|---|
Bool |
Tiny |
Int8 |
Tiny |
Int16 |
Tiny , Short |
Int32 |
Tiny , Short , Int24 , Long |
Int |
Tiny , Short , Int24 , Long , LongLong 8 |
Int64 |
Tiny , Short , Int24 , Long , LongLong |
Integer |
Tiny , Short , Int24 , Long , LongLong |
Word8 |
Tiny |
Word16 |
Tiny , Short |
Word32 |
Tiny , Short , Int24 , Long |
Word64 |
Tiny , Short , Int24 , Long , LongLong |
Double |
Float , Double , Decimal , NewDecimal , Tiny , Short , Int24 , Long |
Ratio Integer |
Float , Double , Decimal , NewDecimal , Tiny , Short , Int24 , Long , LongLong |
ByteString |
VarChar , TinyBlob , MediumBlob , LongBlob , Blob , VarString , String , Set , Enum |
Lazy.ByteString |
VarChar , TinyBlob , MediumBlob , LongBlob , Blob , VarString , String , Set , Enum |
Encoding.Text 9 |
VarChar , TinyBlob , MediumBlob , LongBlob , Blob , VarString , String , Set , Enum |
Lazy.Text |
VarChar , TinyBlob , MediumBlob , LongBlob , Blob , VarString , String , Set , Enum |
[Char] , String |
VarChar , TinyBlob , MediumBlob , LongBlob , Blob , VarString , String , Set , Enum |
UTCTime |
DateTime ,Timestamp |
Day |
Year , Date , NewDate |
TimeOfDay |
Time |
As of writing, there is no support for Word
, Float
, or Scientific
yet.
4 Entity Syntax
“The default resolution for
TIME
andDATETIME
in MySQL is one second. As of MySQL version 5.6.4, and persistent-mysql-2.6.2, fractional seconds are handled correctly if you declare an explicit precision by usingsqltype
. For example, appendingsqltype=TIME(6)
to aTimeOfDay
field definition will give microsecond resolution.”↩︎“Support for
ZonedTime
was dropped in persistent 2.0.UTCTime
can be used withtimestamp without timezone
andtimestamp with timezone
in PostgreSQL. See also the section below about timezone support.”↩︎“The default resolution for
TIME
andDATETIME
in MySQL is one second. As of MySQL version 5.6.4, and persistent-mysql-2.6.2, fractional seconds are handled correctly if you declare an explicit precision by usingsqltype
. For example, appendingsqltype=TIME(6)
to aTimeOfDay
field definition will give microsecond resolution.”↩︎“When
Word
size is 64bit.”↩︎“Utf8 only.”↩︎