FOSDEM PGDay 2019 - Speaker Interview - Damien Clochard

Speaker Interview: Damien Clochard

Anonymization and Data Masking with PostgreSQL Friday 09:20 Hotel

Twitter: @daamien Blog: blog.taadeem.net LinkedIn: @damienclochard Company website: dalibo.com Other: postgresql_anonymizer

Could you briefly introduce yourself?

My name is Damien Clochard. I have worked since 2005 for Dalibo, a worker-owned cooperative dedicated to PostgreSQL in France. I’ve held various positions in the company over the years; at the moment I am a DBA doing mostly Postgres integration, support and training...

How do you engage with the PostgreSQL Community?

I’m involved in the PostgreSQL community at various levels: I’m part of the admin team behind the www.postgresql.fr platform. A few years ago, I launched a publication called “PostgreSQL Magazine”. I’m also president of the French-speaking PostgreSQL users association and one of the organizers of the PG Day France conference.

Have you enjoyed previous FOSDEM conferences, either as attendee or as speaker?

FOSDEM is probably my favorite open source event! I really like to see all these different free software communities gathering in one place for a few days. All these passionate people from around the world sharing more than source code… It makes me feel small and important at the same time. When I came here for the first time 12 years ago, I was amazed by the vibe of the event. Even now, it never ceases to impress me.

What will your talk be about, exactly? Why this topic?

I’m going to talk about anonymization and data masking. Over the last 2 years, we’ve seen growing concerns about the protection of personal data… There’s the GDPR of course, but it goes beyond legal obligations. I think free software communities must lead the way in building a future where privacy and anonymity are available to everyone. And of course PostgreSQL has an important role to play in this domain because it’s by far the world’s most dynamic and innovative database system.

Personally I think that PostgreSQL must evolve from being a simple data storage engine to a data protection platform with an emphasis on security features like encryption or row level security policies… In that regard, data anonymization is an old topic but it’s also a rather unexplored area.

Last year I started a project called “PostgreSQL Anonymizer”, which is basically a PoC to show why we should write anonymization and masking rules directly using the SQL syntax...

What is the audience for your talk?

In most organizations, the anonymization of sensitive information is a task assigned to database administrators (DBAs). So my talk is aimed at every DBA who has ever tried to remove personal info from a dataset...

However, I’m convinced that the anonymization policy of a dataset should be defined at the early development stages of every application. It is a design task, just like choosing data types, adding indexes, defining integrity constraints, creating foreign keys, etc. So my talk is also aimed at developers, to convince them that they have the responsibility to describe how their datasets must be anonymized…

What is the one feature in PostgreSQL 11 which you like most?

PostgreSQL 11 introduced built-in binary string functions such as sha256(), sha512(), etc. It’s not the most spectacular feature because they were already available in the pgcrypto extension. But implementing them directly inside the Postgres core and making them accessible to all users is a great move!
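
As a minimal sketch of how these built-in functions can be called (the e-mail address below is an invented sample value, not taken from the talk): sha256() takes and returns bytea, so a text value is converted first and the digest is hex-encoded for display.

```sql
-- Requires PostgreSQL 11 or later; no extension needed.
-- convert_to() turns the text into bytea, sha256() hashes it,
-- and encode(..., 'hex') renders the 32-byte digest as readable text.
SELECT encode(
         sha256(convert_to('alice@example.com', 'UTF8')),
         'hex'
       ) AS email_digest;
```

Note that an unsalted hash of a guessable value such as an e-mail address can be reversed by brute force, so on its own this is closer to pseudonymization than to anonymization.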

Which feature would you like to see in PostgreSQL?

I’m on a mission to convince people that we need to extend the SQL syntax in order to define anonymization policies directly in DDL. Something like:

ALTER TABLE users ALTER COLUMN birth MASK WITH FUNCTION random_date();

It’s easier said than done, obviously. The road is long but my talk is a small step toward this goal!
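
Until such syntax exists, a comparable effect can be approximated with standard PostgreSQL features. The following is only an illustrative sketch, not a description of how PostgreSQL Anonymizer works: the table layout, the view name users_masked and the role analyst are all hypothetical, and the random date is computed inline rather than by a random_date() function.

```sql
-- Hypothetical table, reusing the "users" and "birth" names from the example above.
CREATE TABLE users (
    id    integer PRIMARY KEY,
    name  text NOT NULL,
    birth date NOT NULL
);

-- Masked view: the name is replaced by a constant and the birth date
-- by a random date within roughly the last 50 years.
CREATE VIEW users_masked AS
SELECT id,
       'REDACTED'::text AS name,
       current_date - floor(random() * 365 * 50)::int AS birth
FROM users;

-- Hypothetical role that may only see the masked data.
CREATE ROLE analyst;
REVOKE ALL ON users FROM analyst;
GRANT SELECT ON users_masked TO analyst;
```

By default a view is executed with the privileges of its owner, so the analyst role can read users_masked without any access to the underlying table. The drawback, compared with the proposed syntax, is that the masking rule lives in a separate object instead of being declared on the column itself.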

FOSDEM is a very large conference. Are there any other talks you want to see? Where will people usually find you?

I usually spend a lot of time in the Open Source Design devroom (link below), which is always fun and insightful… And of course you’ll find me around the PostgreSQL devroom!

https://fosdem.org/2019/schedule/track/open_source_design/