Google+ Followers

zaterdag 24 januari 2015

Why I like PostgreSQL (more than MySQL): memcache

Memcache is a fast in-memory key/value store. It is often used as a cache for data that is slow to generate and that needs to be served often. Given that it's usually the database that is the "slowest" part of the system (I use quotes because it the database is usually also the part of the system that does most of the work, people tend to forget that in their hunt for "performance". I use quotes here because the word performance is very relative and incudes much more than getting a reply quickly)

But, saving database results anywhere other than the database creates an instant race condition. The data in memcached is must be considered outdated the moment it is stored, because from that moment on the data will not change when the source changes.

Enter PgMemcached.

PgMemcached is a loadable extension for PostgreSQL that enables access to a memcached server from within a query. It allows you to do something like:

SELECT memcache_server_add('127.0.0.1:11211');
SELECT memcache_set('foo','bar');
SELECT memcache_get('foo');

Which will return 'bar'.

But because it is a regular memcached server, you can now do this:

> telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
> get foo
VALUE foo 0 3
bar
END

So, PostgreSQL can now read and write directly to your memcached server and share the data that your application is storing there. That opens up the next step:

DROP TABLE IF EXISTS tmp_news CASCADE;
CREATE TEMPORARY TABLE tmp_news (id INTEGER, title VARCHAR(100), body TEXT);
INSERT INTO tmp_news VALUES (1, 'Webber wins'),(2,'Alonso loses'),(3, 'Gingers have souls');
CREATE OR REPLACE FUNCTION update_news_cache() RETURNS TRIGGER AS 
$$
DECLARE
f TEXT;
junk RECORD;
BEGIN
WITH json_rows AS (SELECT ROW_TO_JSON(tmp_news) AS r FROM tmp_news ORDER BY id DESC LIMIT 10)
SELECT memcache_set('news_cache', JSON_AGG(r)::TEXT) FROM json_rows INTO junk;
  RETURN NEW;
END
$$
LANGUAGE PLPGSQL;

CREATE TRIGGER trg_update_news_cache AFTER INSERT ON tmp_news FOR EACH ROW EXECUTE PROCEDURE update_news_cache();

INSERT INTO tmp_news VALUES (4,'UFO sighted at Malibu');

SELECT memcache_get('news_cache');

So, adding a news item to tmp_news automaticallty updates a memcached value with a JSON array of the news items, for use in your application.
Critically, with this in place your news cache is updated regardles of how the newsitems were changed. It can be done by your application, or by a manual query, whatever changes the news will force an update of the cache. This in turn means that you don't have to check the validity of the cache anymore. if the value exists then it is up-to-date by definition (well, iunless you've changed it by other means...)