1 Introduction
In Haskell, we have many database libraries. Persistent is a layer atop, using some of them as its backends, that imposes type safety and seamless integration with Haskell code. With persistent-odbc many more DBMSs can be attached. Own drivers can be written. Whenever Persistent turns out to be insufficient, we can fall back to the raw driver or its Haskell wrapper, and write a Persistent ORM ourselves. The framework is wonderful, but as everything has its caveats. We will discuss some them in the course of this series of posts here.
In other words, Persistent is a seamless marshalling or serialization library that removes most boilerplate code1.
This marshalling or object-relational mapping (ORM) imposes type-safety onto the conventional untyped database drivers, which may be written in an arcane dialect of C (macros!), such as the pgsql driver, notorious for its undocumented features. We thereby can employ Haskell’s type system and GHC’s facilities to ensure that we don’t make mistakes in our queries and that the data remains consistent throughout our data transformations. The key point here is that any errors that might occur are pulled back from runtime to compile-time, where we can fix it proactively. In particular, type-safety prevents SQL injection attacks.
There are basically four different approaches to dealing with databases in Haskell:
use the existing drivers provided as backends and create a layer atop that imposes Haskell types on the level of Haskell code — this is the approach of serialization, marshalling, or ORM taken by Persistent and Esqueleto (more later), and many others listed below;
use the existing drivers via foreign-function interfaces (FFIs) by introducing a thin wrapper around them, and run direct textual queries, hoping for consistency — this is what
sqlite-simpleandpostgresql-simpleamong others do (“mid-level client libraries”);run third-party database drivers as subprocesses — this is very crude but may turn out to be necessary in some situations;
rewrite the existing driver directly in Haskell, which enables us to introduce category-theoretic abstractions that can simplify code and improve performance — this is what Nikita Volkov’s
hasqllibrary for Postgres does.
Some other persistence and ORM-like libraries:
- Opaleye
- Beam
- Groundhog
- Selda
- Squeal
- Triggers for SQL
- Persistent-ODBC
- Persistent-Zookeeper
vault(a persistent storage of values of arbitrary types)TCache
2 Limitations in Persistent that can be overcome with Esqueleto
So when do we use which of these two libraries?
- Persistent provides type-safe serialization of data. It allows for type-safe filtered storage and retrieval of data from a database (remote-resource, local resource, file, or in-memory databases). It does not support type-safe
JOINs, but we can fall back to explicit-string queries. - Esqueleto: provides an embedded domain-specific language (EDSL) for SQL and superposes type-safe
JOINs atop of Persistent SQL backends. As of writing this, Esqueleto supports only type-safeSELECT,UPDATE,INSERT, andDELETEqueries. It does not provide for all SQL features. So we can either implement them by hand whenever necessary, which is relatively easy, or switch to the fall-back mode of explicit-string queries.
We can use both libraries in the same module, in which case we would have to import at least one of them qualified, so as to avoid identifier conflicts.2
-- For a module that mostly uses esqueleto.
import Database.Esqueleto
import qualified Database.Persistent as Por import esqueleto itself qualified:
-- For a module that uses esqueleto just on some queries.
import Database.Persistent
import qualified Database.Esqueleto as EOur plan is to first learn to use Persistent, and then extend it with Esqueleto.
3 Fundamental Definitions
Here is the fundamental correspondence3:
| SQL DBMS | Persistent |
|---|---|
| Data types | PersistValue |
| Column | PersistField |
| Table | PersistEntity |
Also consider this migrations (conversions) table for marshalling4:
| Haskell | PostgreSQL | MySQL | MongoDB | SQLite |
|---|---|---|---|---|
Text | |
VARCHAR |
TEXT |
String |
VARCHAR |
ByteString |
BYTEA |
BLOB |
BinData |
BLOB |
Int |
INT8 |
BIGINT(20) |
NumberLong |
INTEGER |
Double |
DOUBLE PRECISION |
DOUBLE |
Double |
REAL |
Rational |
NUMERIC(22,12) |
DECIMAL(32,20) |
unsupported | NUMERIC(32,20) |
Bool |
BOOLEAN |
TINYINT(1) |
Boolean |
BOOLEAN |
Day |
DATE |
DATE |
NumberLong |
DATE |
TimeOfDay |
TIME |
TIME5 |
unsupported | TIME |
UTCTime6 |
TIMESTAMP |
DATETIME7 |
Date |
TIMESTAMP |
And, in particular, for MySQL and MariaDB:
| Haskell type | Compatible MySQL types |
|---|---|
Bool |
Tiny |
Int8 |
Tiny |
Int16 |
Tiny, Short |
Int32 |
Tiny, Short, Int24, Long |
Int |
Tiny, Short, Int24, Long, LongLong8 |
Int64 |
Tiny, Short, Int24, Long, LongLong |
Integer |
Tiny, Short, Int24, Long, LongLong |
Word8 |
Tiny |
Word16 |
Tiny, Short |
Word32 |
Tiny, Short, Int24, Long |
Word64 |
Tiny, Short, Int24, Long, LongLong |
Double |
Float, Double, Decimal, NewDecimal, Tiny, Short, Int24, Long |
Ratio Integer |
Float, Double, Decimal, NewDecimal, Tiny, Short, Int24, Long, LongLong |
ByteString |
VarChar, TinyBlob, MediumBlob, LongBlob, Blob, VarString, String, Set, Enum |
Lazy.ByteString |
VarChar, TinyBlob, MediumBlob, LongBlob, Blob, VarString, String, Set, Enum |
Encoding.Text9 |
VarChar, TinyBlob, MediumBlob, LongBlob, Blob, VarString, String, Set, Enum |
Lazy.Text |
VarChar, TinyBlob, MediumBlob, LongBlob, Blob, VarString, String, Set, Enum |
[Char], String |
VarChar, TinyBlob, MediumBlob, LongBlob, Blob, VarString, String, Set, Enum |
UTCTime |
DateTime,Timestamp |
Day |
Year, Date, NewDate |
TimeOfDay |
Time |
As of writing, there is no support for Word, Float, or Scientific yet.
4 Entity Syntax
“The default resolution for
TIMEandDATETIMEin MySQL is one second. As of MySQL version 5.6.4, and persistent-mysql-2.6.2, fractional seconds are handled correctly if you declare an explicit precision by usingsqltype. For example, appendingsqltype=TIME(6)to aTimeOfDayfield definition will give microsecond resolution.”↩︎“Support for
ZonedTimewas dropped in persistent 2.0.UTCTimecan be used withtimestamp without timezoneandtimestamp with timezonein PostgreSQL. See also the section below about timezone support.”↩︎“The default resolution for
TIMEandDATETIMEin MySQL is one second. As of MySQL version 5.6.4, and persistent-mysql-2.6.2, fractional seconds are handled correctly if you declare an explicit precision by usingsqltype. For example, appendingsqltype=TIME(6)to aTimeOfDayfield definition will give microsecond resolution.”↩︎“When
Wordsize is 64bit.”↩︎“Utf8 only.”↩︎